How to Write a Checker¶
You can find some simple examples in the distribution (custom.py , custom_raw.py and deprecation_checker.py).
There are three kinds of checkers:
Raw checkers, which analyse each module as a raw file stream.
Token checkers, which analyse a file using the list of tokens that represent the source code in the file.
AST checkers, which work on an AST representation of the module.
The AST representation is provided by the astroid
library.
astroid
adds additional information and methods
over ast
in the standard library,
to make tree navigation and code introspection easier.
Writing an AST Checker¶
Let's implement a checker to make sure that all return
nodes in a function
return a unique constant.
Firstly we will need to fill in some required boilerplate:
import astroid
from astroid import nodes
from typing import TYPE_CHECKING, Optional
from pylint.checkers import BaseChecker
if TYPE_CHECKING:
from pylint.lint import PyLinter
class UniqueReturnChecker(BaseChecker):
name = "unique-returns"
msgs = {
"W0001": (
"Returns a non-unique constant.",
"non-unique-returns",
"All constants returned in a function should be unique.",
),
}
options = (
(
"ignore-ints",
{
"default": False,
"type": "yn",
"metavar": "<y or n>",
"help": "Allow returning non-unique integers",
},
),
)
So far we have defined the following required components of our checker:
- A name. The name is used to generate a special configuration
section for the checker, when options have been provided.
- A message dictionary. Each checker is being used for finding problems
in your code, the problems being displayed to the user through messages. The message dictionary should specify what messages the checker is going to emit. See Defining a Message for the details about defining a new message.
We have also defined an optional component of the checker. The options list defines any user configurable options. It has the following format:
options = (
("option-symbol", {"argparse-like-kwarg": "value"}),
)
The
option-symbol
is a unique name for the option. This is used on the command line and in config files. The hyphen is replaced by an underscore when used in the checker, similarly to how you would useargparse.Namespace
:if not self.linter.config.ignore_ints: ...
Next we'll track when we enter and leave a function.
def __init__(self, linter: Optional["PyLinter"] = None) -> None:
super().__init__(linter)
self._function_stack = []
def visit_functiondef(self, node: nodes.FunctionDef) -> None:
self._function_stack.append([])
def leave_functiondef(self, node: nodes.FunctionDef) -> None:
self._function_stack.pop()
In the constructor we initialise a stack to keep a list of return nodes
for each function.
An AST checker is a visitor, and should implement
visit_<lowered class name>
or leave_<lowered class name>
methods for the nodes it's interested in.
In this case we have implemented visit_functiondef
and leave_functiondef
to add a new list of return nodes for this function,
and to remove the list of return nodes when we leave the function.
Finally we'll implement the check.
We will define a visit_return
function,
which is called with an .astroid.nodes.Return
node.
We'll need to be able to figure out what attributes an
.astroid.nodes.Return
node has available.
We can use astroid.extract_node
for this:
>>> node = astroid.extract_node("return 5")
>>> node
<Return l.1 at 0x7efe62196390>
>>> help(node)
>>> node.value
<Const.int l.1 at 0x7efe62196ef0>
We could also construct a more complete example:
>>> node_a, node_b = astroid.extract_node("""
... def test():
... if True:
... return 5 #@
... return 5 #@
... """)
>>> node_a.value
<Const.int l.4 at 0x7efe621a74e0>
>>> node_a.value.value
5
>>> node_a.value.value == node_b.value.value
True
For astroid.extract_node
, you can use #@
at the end of a line to choose which statements will be extracted into nodes.
For more information on astroid.extract_node
,
see the astroid documentation.
Now we know how to use the astroid node, we can implement our check.
def visit_return(self, node: nodes.Return) -> None:
if not isinstance(node.value, nodes.Const):
return
for other_return in self._function_stack[-1]:
if node.value.value == other_return.value.value and not (
self.linter.config.ignore_ints and node.value.pytype() == int
):
self.add_message("non-unique-returns", node=node)
self._function_stack[-1].append(node)
Once we have established that the source code has failed our check,
we use ~.BaseChecker.add_message
to emit our failure message.
Finally, we need to register the checker with pylint.
Add the register
function to the top level of the file.
def register(linter: "PyLinter") -> None:
"""This required method auto registers the checker during initialization.
:param linter: The linter to register the checker to.
"""
linter.register_checker(UniqueReturnChecker(linter))
We are now ready to debug and test our checker!
Debugging a Checker¶
It is very simple to get to a point where we can use pdb
.
We'll need a small test case.
Put the following into a Python file:
def test():
if True:
return 5
return 5
def test2():
if True:
return 1
return 5
After inserting pdb into our checker and installing it, we can run pylint with only our checker:
$ pylint --load-plugins=my_plugin --disable=all --enable=non-unique-returns test.py
(Pdb)
Now we can debug our checker!
Note
my_plugin
refers to a module called my_plugin.py
.
The preferred way of making this plugin available to pylint is
by installing it as a package. This can be done either from a packaging index like
PyPI
or by installing it from a local source such as with pip install
.
Alternatively, the plugin module can be made available to pylint by
putting this module's parent directory in your PYTHONPATH
environment variable.
If your pylint config has an init-hook
that modifies
sys.path
to include the module's parent directory, this
will also work, but only if either:
the
init-hook
and theload-plugins
list are both defined in a configuration file, or...the
init-hook
is passed as a command-line argument and theload-plugins
list is in the configuration file
So, you cannot load a custom plugin by modifying sys.path
if you
supply the init-hook
in a configuration file, but pass the module name
in via --load-plugins
on the command line.
This is because pylint loads plugins specified on command
line before loading any configuration from other sources.
Defining a Message¶
Pylint message is defined using the following format:
msgs = {
"E0401": ( # message id
"Unable to import %s", # template of displayed message
"import-error", # message symbol
"Used when pylint has been unable to import a module.", # Message description
{ # Additional parameters:
# message control support for the old names of the messages:
"old_names": [("F0401", "old-import-error")]
"minversion": (3, 5), # No check under this version
"maxversion": (3, 7), # No check above this version
},
),
The message is then formatted using the args
parameter from add_message
i.e. in
self.add_message("import-error", args=module_we_cant_import, node=importnode)
, the value in module_we_cant_import
say patglib
will be interpolled and the final result will be:
Unable to import patglib
The
message-id
should be a 4-digit number, prefixed with a message category. There are multiple message categories, these beingC
,W
,E
,F
,R
, standing forConvention
,Warning
,Error
,Fatal
andRefactoring
. The 4 digits should not conflict with existing checkers and the first 2 digits should consistent across the checker (except shared messages).The
displayed-message
is used for displaying the message to the user, once it is emitted.The
message-symbol
is an alias of the message id and it can be used wherever the message id can be used.The
message-help
is used when callingpylint --help-msg
.
Optionally message can contain optional extra options:
The
old_names
option permits to change the message id or symbol of a message without breaking the message control used on the old messages by users. The option is specified as a list of tuples (message-id
,old-message-symbol
) e.g.{"old_names": [("F0401", "old-import-error")]}
. The symbol / msgid association must be unique so if you're changing the message id the symbol also need to change and you can generally use theold-
prefix for that.The
minversion
ormaxversion
options specify minimum or maximum version of python relevant for this message. The option value is specified as tuple with major version number as first number and minor version number as second number e.g.{"minversion": (3, 5)}
The
shared
option enables sharing message between multiple checkers. As mentioned previously, normally the message cannot be shared between multiple checkers. To allow having message shared between multiple checkers, theshared
option must be set toTrue
.
Parallelize a Checker¶
BaseChecker
has two methods get_map_data
and reduce_map_data
that
permit to parallelize the checks when used with the -j
option. If a checker
actually needs to reduce data it should define get_map_data
as returning
something different than None
and let its reduce_map_data
handle a list
of the types returned by get_map_data
.
An example can be seen by looking at pylint/checkers/similar.py
.
Testing a Checker¶
Pylint is very well suited to test driven development. You can implement the template of the checker, produce all of your test cases and check that they fail, implement the checker, then check that all of your test cases work.
Pylint provides a pylint.testutils.CheckerTestCase
to make test cases very simple.
We can use the example code that we used for debugging as our test cases.
import astroid
import my_plugin
import pylint.testutils
class TestUniqueReturnChecker(pylint.testutils.CheckerTestCase):
CHECKER_CLASS = my_plugin.UniqueReturnChecker
def test_finds_non_unique_ints(self):
func_node, return_node_a, return_node_b = astroid.extract_node("""
def test(): #@
if True:
return 5 #@
return 5 #@
""")
self.checker.visit_functiondef(func_node)
self.checker.visit_return(return_node_a)
with self.assertAddsMessages(
pylint.testutils.MessageTest(
msg_id="non-unique-returns",
node=return_node_b,
),
):
self.checker.visit_return(return_node_b)
def test_ignores_unique_ints(self):
func_node, return_node_a, return_node_b = astroid.extract_node("""
def test(): #@
if True:
return 1 #@
return 5 #@
""")
with self.assertNoMessages():
self.checker.visit_functiondef(func_node)
self.checker.visit_return(return_node_a)
self.checker.visit_return(return_node_b)
Once again we are using astroid.extract_node
to
construct our test cases.
pylint.testutils.CheckerTestCase
has created the linter and checker for us,
we simply simulate a traversal of the AST tree
using the nodes that we are interested in.