Python plugin implementation #781

Open
wants to merge 156 commits into
base: master
6e0abaa
Added python plugin based on dummy and cpp
Oct 12, 2023
11c9ac7
PythonService basic implementations
Oct 12, 2023
9596452
Added PythonService file types
Oct 13, 2023
75224af
Python plugin webgui
Oct 16, 2023
4bf7c7e
Python plugin webgui infotree
Oct 16, 2023
e5a2973
Added Python InfoTree and Menu
Oct 19, 2023
9752433
pythonplugin basic parser
Nov 27, 2023
68ca707
python parser
Nov 30, 2023
4bb2349
pythonplugin model
Dec 4, 2023
a899674
PYName model
Feb 13, 2024
9be66d1
PythonParser parse function
Feb 13, 2024
2f0b3ce
PythonParser parsing a file
Feb 14, 2024
2b07a9b
PYName hash
Feb 14, 2024
bbb0328
Inserting PYName into database
Feb 16, 2024
50c88af
PYName added file_id
Feb 16, 2024
550370c
PythonService queries PYName from database
Feb 16, 2024
5a238e2
Update PyName hash function
Feb 19, 2024
470490c
PythonService find references.
Feb 20, 2024
c46a43e
Added support for Python virtual environments.
Feb 21, 2024
1f4093f
Log parse results.
Feb 22, 2024
a578684
PythonParser prepare input
Feb 22, 2024
02518b8
PythonParser adding more parse statistics.
Feb 23, 2024
eb2a4cf
PythonParser removed venv_config
Feb 23, 2024
cb0969c
PythonParser handle missing modules, added type hint
Feb 23, 2024
aafc8c4
PythonService display properties.
Feb 23, 2024
8a6e8e1
PythonParser Adding imported files to database.
Feb 23, 2024
f179caa
PythonParser module definition should be the module itself.
Feb 23, 2024
a0e7b45
PythonParser Flags for additional syspath.
Feb 26, 2024
aac3d0c
PythonService Fix astNodeID
Feb 27, 2024
b2e2f00
PythonParser multiprocess
Mar 4, 2024
1daabf6
PythonParser multiprocess flag
Mar 5, 2024
b082a68
PythonParser logging
Mar 5, 2024
cc09bc0
PythonParser Fix missing module report
Mar 5, 2024
c13f4d4
Include missing headers
Mar 14, 2024
c5e1b1b
PythonParser simplify parsing finished message
Mar 18, 2024
0a87b2b
Python syntax highlight
Mar 20, 2024
2898a4b
PythonService skip imports from definition
Mar 20, 2024
e71f85c
PythonService order results by line_start
Mar 25, 2024
b4056b3
PythonService Set range for selected node
Mar 25, 2024
72b8689
PythonParser Parent PYName
Mar 26, 2024
6589b75
PythonService Class methods and data members
Mar 26, 2024
2987af7
PythonService Added utility functions
Mar 27, 2024
2fc59f9
PythonService Added support for class references, function local vari…
Mar 27, 2024
ab10b54
PythonService Added reference list type hints, parent node
Apr 22, 2024
ca885c2
PythonService Added parameter references
Apr 22, 2024
4fced18
PythonService Added getAstNodeInfo
Jul 11, 2024
c93004d
PyParser filter Python files before multiprocessing
Jul 25, 2024
23eeb89
PyParser simplify name position info
Aug 20, 2024
e69c29a
PyParser refactor
Aug 20, 2024
6b88323
PyParser PosInfo, ASTHelper
Aug 20, 2024
33ff8e7
PyParser isFunctionCall handle attribute case
Aug 31, 2024
a3bcd21
PythonService this calls
Aug 31, 2024
8ebccbf
Webgui Function call icon
Sep 1, 2024
2adcb64
PythonService Usage - ignore same PYName
Sep 1, 2024
9b37f24
PythonPlugin venv
Sep 3, 2024
e0f8514
ignore .cache
Sep 3, 2024
cbbd832
PyParser Catch jedi errors
Sep 4, 2024
178526f
PythonService less transaction
Sep 5, 2024
42ef5f5
PyParser Highlight venv path error
Sep 5, 2024
e8354f2
PyParser safe environment config
Sep 5, 2024
2ce3538
PyParser Better logging of messages
Sep 5, 2024
7ea2de8
[Diagrams] Function call diagram WIP
Sep 10, 2024
bb7cc86
[PythonService] Function call diagram
Sep 14, 2024
767c621
[Diagrams] Find parent function
Sep 15, 2024
a6ea04b
[PythonService] getSourceText
Sep 15, 2024
0a4a58e
[PyParser] Add all definition import paths
Sep 16, 2024
e243eea
[Diagrams] Make certain nodes unclickable
Sep 17, 2024
50e6f5f
[Diagrams] Function call diagram file paths
Sep 17, 2024
61dd32b
[Diagrams] Fix node decoration types
Sep 17, 2024
38326e3
[PyParser] Use asthelper for is_import
Sep 17, 2024
3b4bd8d
[Diagrams] Fix node decoration types again
Sep 17, 2024
9080542
[Model] Database indexes
Sep 17, 2024
881e3d0
[PythonService] queryNodes, module diagram
Sep 18, 2024
f0058c5
[Diagrams] Refactor, module diagram
Sep 18, 2024
ed018d3
[PyParser] Set a value for entire module nodes
Sep 18, 2024
0f64b3c
[Diagrams] Catch invalid file id exception
Sep 18, 2024
a85cb46
[Diagrams] Remove module nodes
Sep 19, 2024
ea68289
[Webgui] Handle AstNodes in a file diagram
Sep 20, 2024
db09a70
[PythonService] transformReferences, query definitions in file
Sep 20, 2024
a8c6374
[Diagrams] Added imported usage to the module diagram
Sep 20, 2024
a91756c
[Diagrams] Remove duplicate line nodes from module diagram
Sep 20, 2024
226d590
[Diagrams] Usage diagrams
Sep 21, 2024
3dc0fa1
[PythonService] Node not found exceptions
Sep 21, 2024
5c156ca
[PythonParser] Improve database insert
Sep 21, 2024
0c4b073
[PyParser] Make type_hint a config option
Sep 21, 2024
34ef5dd
[PyParser] Make ASTHelper faster
Sep 21, 2024
6460c23
[PyParser] Only process definitions once
Sep 26, 2024
5ab7182
[PythonService] Debug build: ID, REF_ID, DEFINITION property
Sep 30, 2024
51e0535
[PyParser] Only calculate PosInfo once
Oct 1, 2024
c8348be
[PyParser] PYReference, getFileRefs
Oct 1, 2024
9f25689
[PythonService] queryNodeByPosition: prefer shorter node values
Oct 1, 2024
74dd58b
[PyParser] PYBuiltin
Oct 1, 2024
ad63d87
[Diagrams] Simplify module dependency diagram
Oct 1, 2024
bd3c862
[PythonService] Refactor getDiagramTypes
Oct 1, 2024
cd80bbb
[PyParser] PosInfo dataclass
Oct 2, 2024
b02fb17
[PyParser] ParserConfig
Oct 2, 2024
6c27aba
[PyParser] PYReference error handling, stack trace config option
Oct 2, 2024
d72b6ff
[PyParser] module_path should be str
Oct 2, 2024
9db1208
[PyParser] Stack trace warning color
Oct 2, 2024
60ac1c1
[PyParser] PYBuiltin stack trace
Oct 5, 2024
57ccf00
[Diagrams] Use different color for builtin paths
Oct 5, 2024
c725657
[PythonService] queryNodes ORDER BY is_builtin
Oct 5, 2024
889bb58
[PyParser] PYBuiltin rework
Oct 5, 2024
f08cffd
[Diagrams] Fix file path color
Oct 5, 2024
908dba7
[PyParser] PYBuiltin stdlib submodules
Oct 5, 2024
194bda7
[Diagrams] Add a comment explanation to file path nodes
Oct 5, 2024
a865b4b
[Diagrams] Diagram Legend
Oct 5, 2024
1220c4e
[Diagrams] Remove border from certain file path nodes
Oct 6, 2024
ede556b
[Diagrams] Class diagram
Oct 6, 2024
3ca692e
[PyParser] astparam
Oct 6, 2024
ecefcd7
[PyParser] PYName path, move getHashName to parserutil
Oct 9, 2024
8dac26c
[PyParser] NodeInfo, ASThelper path
Oct 9, 2024
a6b12dd
[PyParser] Annotations
Oct 12, 2024
4d726dc
Graphviz Cairo SVG rendering
Oct 12, 2024
45bf725
[Diagrams] Class diagram return annotation, syntax highlight
Oct 12, 2024
97de0b7
[PyParser] getSubclass
Oct 13, 2024
b4dd3bd
[PythonService] Base class
Oct 13, 2024
262d032
[Diagrams] Class diagram base class
Oct 13, 2024
d757ce3
[PyParser] Rewrite definition handling
Oct 15, 2024
7bbb69e
[PyParser] More config options
Oct 15, 2024
856587e
[PyParser] AST function signature
Oct 15, 2024
e339e33
[PyParser] Added more config options
Oct 15, 2024
b926226
[PyParser] log_config
Oct 15, 2024
e9c9501
[PyParser] ParseResult
Oct 15, 2024
c0d02b3
[PyParser] __getPosValue, getFunctionParam
Oct 16, 2024
cb7d213
[Diagrams] Class diagram variable highlight fix
Oct 16, 2024
0c51573
[PyParser] Expand __getASTValue
Oct 16, 2024
7cb12e9
[PythonParser] CLI arguments
Oct 21, 2024
67bb581
[PyParser] Submodule discovery
Nov 16, 2024
509b936
[PythonTest] Parser test
Nov 17, 2024
02ac6b9
[PythonTest] Service test
Nov 17, 2024
0c4d40f
[Webgui] Fix caller infinite grouping
Nov 18, 2024
d02a8ea
[Webgui] Cleanup
Nov 18, 2024
8a03091
[PythonParser] Remove accept function
Nov 18, 2024
ffb2676
[PyParser] Refactor getFunctionSignature
Nov 18, 2024
25e9e71
[PyParser] Fix annotation overwritten by jedi node
Nov 20, 2024
137719f
[PythonTest] Parser test - functions
Nov 20, 2024
08dd0a8
[PythonTest] Parser test - class inheritance
Nov 20, 2024
b8ed093
[PythonTest] Parser test - queryFile
Nov 20, 2024
ed939a3
[PythonTest] Parser test - class methods, local variables
Nov 20, 2024
67ead9f
[PythonTest] Parser test - imports
Nov 20, 2024
8cbea73
[PythonTest] Parser test - builtin variable
Nov 20, 2024
4be4e94
[PythonTest] Parser test - ClassType, ReferenceID
Nov 23, 2024
f42f59e
[PythonTest] Service test - node properties, reference tests
Nov 23, 2024
bd6169e
pycache gitignore
Nov 23, 2024
b0b111d
[PythonService] move boolToString to util
Nov 23, 2024
e322c2b
[PythonService] Cleanup
Nov 23, 2024
df85f87
[PythonService] Make PythonDiagram friend class
Nov 23, 2024
62c93ca
[PythonService] Debug message spacing
Nov 23, 2024
bcb17b0
[PythonParser] parse() string ref
Nov 24, 2024
51dcc77
[PyParser] ASTHelper __equalPos
Nov 24, 2024
706b7e6
[Diagrams] Use relative path
Nov 24, 2024
e560632
[PythonService] Remove getProperties message for release builds
Dec 1, 2024
286ad16
Fix compilation warnings.
Feb 25, 2025
c946ed8
Added file refs option.
Feb 27, 2025
a55f31c
Added Python plugin documentation.
Feb 27, 2025
2 changes: 2 additions & 0 deletions .gitignore
@@ -9,6 +9,8 @@ nbproject/
# Vim
*.swp

# clangd cache
.cache/

## Build folders
build/
3 changes: 3 additions & 0 deletions Config.cmake
@@ -51,6 +51,9 @@ set(INSTALL_GEN_DIR "${INSTALL_SCRIPTS_DIR}/generated")
# Installation directory for Java libraries
set(INSTALL_JAVA_LIB_DIR "${INSTALL_LIB_DIR}/java")

# Installation directory for the Python plugin
set(INSTALL_PYTHON_DIR "${INSTALL_LIB_DIR}/pythonplugin")

# Installation directory for executables
set(INSTALL_BIN_DIR "bin")

123 changes: 123 additions & 0 deletions doc/pythonplugin.md
@@ -0,0 +1,123 @@
# Python Plugin

## Parsing Python projects
Python projects can be parsed by using the `CodeCompass_parser` executable.
See its usage [in a separate document](/doc/usage.md).

## Python specific parser flags

### Python dependencies
Large Python projects usually depend on several Python packages.
Although a project can be parsed without any of its dependencies installed, installing them is strongly recommended,
because imports of packages that are not installed cannot be resolved and the parse result is incomplete.
To install a project's dependencies, create a [Python virtual environment](https://docs.python.org/3/library/venv.html)
and install the necessary packages into it.
When parsing the project, specify the path of the virtual environment so that the parser can resolve the dependencies:
```
--venvpath <path to virtual environment>
```

### Type hints
The parser can try to determine Python type hints for variables, expressions and functions.
It can work out type hints such as `Iterable[int]` or `Union[int, str]`.
However, this process can be extremely slow, especially for functions, so it is disabled by default.
It can be enabled using the `--type-hint` flag.
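
As an illustration of what such hints describe, the sketch below defines two small functions whose return types correspond to `Iterable[int]` and `Union[int, str]`. The function names are hypothetical and the annotations are written by hand here; the exact text reported by the parser may differ.

```python
from typing import Iterable, Union


def evens(limit: int) -> Iterable[int]:
    """Yield the even numbers below `limit` (hypothetical example)."""
    return (n for n in range(limit) if n % 2 == 0)


def parse_port(raw: str) -> Union[int, str]:
    """Return an int for numeric input, otherwise the original string."""
    return int(raw) if raw.isdigit() else raw
```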

### Python submodules
Large Python projects can have internal submodules and the parser tries to locate them automatically.
Specifically, it looks for `__init__.py` files and treats every folder that contains one as a module.
This process is called submodule discovery and can be disabled using the `--disable-submodule-discovery` flag.
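
The idea behind submodule discovery can be sketched in a few lines of Python. This is only an approximation under stated assumptions, not the plugin's actual implementation: the function name `discover_packages` and the list of skipped folders are hypothetical.

```python
import os
from typing import List


def discover_packages(root: str) -> List[str]:
    """Return every directory under `root` that contains an __init__.py file."""
    packages = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption of this sketch: do not descend into virtual
        # environments or bytecode caches.
        dirnames[:] = [d for d in dirnames if d not in {"venv", "__pycache__"}]
        if "__init__.py" in filenames:
            packages.append(dirpath)
    return sorted(packages)
```

Calling `discover_packages` on a project root returns the folders that would count as modules under the rule described above.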

You can also add submodules manually by adding their paths to the parser's syspath:
```
--syspath <path>
```
For more information, see the [Python syspath docs](https://docs.python.org/3/library/sys.html#sys.path).
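
To show why an extra search path matters, here is a minimal sketch of Python's own `sys.path` mechanism. The module name `cc_example_helper` and the temporary directory are made up for the illustration; how the parser applies `--syspath` internally is not shown here.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory containing a module named
# `cc_example_helper` (hypothetical), which is not importable by default.
extra_dir = tempfile.mkdtemp()
with open(os.path.join(extra_dir, "cc_example_helper.py"), "w") as f:
    f.write("ANSWER = 42\n")

# Once the directory is on sys.path, the import system searches it too.
sys.path.insert(0, extra_dir)
importlib.invalidate_caches()
helper = importlib.import_module("cc_example_helper")
print(helper.ANSWER)
```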

### File references
By default, the parser finds references through definitions only: nodes that share the same definition
are considered references to each other.
However, this method sometimes misses references (e.g. local variables in a function).
To also search for references within the context of each file, use the `--file-refs` flag.
Note that this option can increase the total parsing time.
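
For intuition, a file-level search can be approximated with Python's standard `ast` module. The sketch below is a simplification that matches names only and ignores scopes; it is not the algorithm used by the parser, and `name_positions` is a hypothetical helper.

```python
import ast
from typing import List, Tuple

SOURCE = """
def total(items):
    count = 0
    for item in items:
        count += item
    return count
"""


def name_positions(source: str, name: str) -> List[Tuple[int, int]]:
    """Return (line, column) of every occurrence of `name` in `source`."""
    tree = ast.parse(source)
    return sorted(
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and node.id == name
    )


# All three uses of the local variable `count` are found within the file.
print(name_positions(SOURCE, "count"))
```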

## Examples of parsing Python projects

### Flask
We downloaded the [Flask 3.1.0](https://github.com/pallets/flask/releases/tag/3.1.0) source code to `~/parsing/flask/`.
The first step is to create a Python virtual environment, activate it, and install Flask's dependencies into it:
```bash
cd ~/parsing/flask/
python3 -m venv venv
source venv/bin/activate
```
Next, we install the required dependencies listed in `pyproject.toml`.
```bash
pip install .
```
Further dependencies include development packages listed in `requirements/dev.txt`.
These can also be installed using `pip`.
```bash
pip install -r requirements/dev.txt
```
Finally, we can run `CodeCompass_parser`.
```bash
CodeCompass_parser \
-n flask \
-i ~/parsing/flask/ \
-w ~/parsing/workdir/ \
-d "pgsql:host=localhost;port=5432;user=compass;password=pass;database=flask" \
-f \
--venvpath ~/parsing/flask/venv/ \
--label src=~/parsing/flask/
```

### CodeChecker
We downloaded the [CodeChecker 6.24.4](https://github.com/Ericsson/codechecker/releases/tag/v6.24.4) source code to `~/parsing/codechecker`.
CodeChecker automates creating a Python virtual environment and installing the dependencies through the `venv` target of its Makefile:
```bash
cd ~/parsing/codechecker/
make venv
```
Next, we can run `CodeCompass_parser`.
```bash
CodeCompass_parser \
-n codechecker \
-i ~/parsing/codechecker/ \
-w ~/parsing/workdir/ \
-d "pgsql:host=localhost;port=5432;user=compass;password=pass;database=codechecker" \
-f \
--venvpath ~/parsing/codechecker/venv/ \
--label src=~/parsing/codechecker/
```

## Troubleshooting
A few errors can occur during the parsing process; these are highlighted in red.
The stack trace is hidden by default and can be shown using the `--stack-trace` flag.

### Failed to use virtual environment
This error can appear when the `--venvpath` option is specified during parsing.
The parser tried to use the specified virtual environment path but failed.

#### Solution
Double-check that the Python virtual environment is set up correctly and that its
path is correct.
If the error persists, apply the `--stack-trace` parser option
to view a more detailed stack trace of the error.
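
A quick sanity check of the path can also be scripted. The following heuristic is a sketch, not the check performed by the parser: it assumes the environment was created with `python3 -m venv`, and `looks_like_venv` is a hypothetical helper.

```python
import os


def looks_like_venv(path: str) -> bool:
    """Heuristically check that `path` is a Python virtual environment."""
    path = os.path.expanduser(path)
    # Environments created by `python3 -m venv` contain a pyvenv.cfg file.
    has_cfg = os.path.isfile(os.path.join(path, "pyvenv.cfg"))
    # The interpreter lives in bin/ on POSIX and in Scripts/ on Windows.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    has_python = os.path.exists(os.path.join(path, bin_dir, exe))
    return has_cfg and has_python
```

For example, `looks_like_venv("~/parsing/flask/venv/")` should return `True` after the Flask setup steps shown earlier in this document.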

### Missing module (file = path line = number)
In this case, the parser tried to parse a given Python file but
could not find the definition of a module.
Commonly, the Python file imports another module that the parser cannot locate.
If this happens, the Python file is marked *partial*, indicating that
a module definition was not resolved in this file.
The error message displays the module name, the exact file path and the line number
so that the problem can be investigated further.

#### Solution
Ensure that the `--venvpath` option is correctly specified and that all required
dependencies are installed in that Python virtual environment.
If the imported module is part of the parsed project, use the `--syspath` option
to specify the directory in which the module is located.
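
To check whether a module can be located at all, one option is to ask the interpreter of the virtual environment directly. This sketch uses the standard `importlib.util.find_spec` function; `can_locate` is a hypothetical helper and is independent of how the parser resolves modules.

```python
import importlib.util


def can_locate(module_name: str) -> bool:
    """Return True if the current interpreter can find `module_name`."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError (including ModuleNotFoundError) is raised when a
        # parent package of a dotted name is missing; ValueError when an
        # already imported module has no usable spec.
        return False
```

Running this with the environment's own interpreter (for example `venv/bin/python`) shows whether the dependency is installed there.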

6 changes: 6 additions & 0 deletions plugins/python/CMakeLists.txt
@@ -0,0 +1,6 @@
add_subdirectory(model)
add_subdirectory(parser)
add_subdirectory(service)
add_subdirectory(test)

install_webplugin(webgui)
10 changes: 10 additions & 0 deletions plugins/python/model/CMakeLists.txt
@@ -0,0 +1,10 @@
set(ODB_SOURCES
include/model/pyname.h
)

generate_odb_files("${ODB_SOURCES}")

add_odb_library(pythonmodel ${ODB_CXX_SOURCES})
target_link_libraries(pythonmodel model)

install_sql()
53 changes: 53 additions & 0 deletions plugins/python/model/include/model/pyname.h
@@ -0,0 +1,53 @@
#ifndef CC_MODEL_PYNAME_H
#define CC_MODEL_PYNAME_H

#include <cstdint>
#include <string>
#include <odb/core.hxx>

namespace cc
{
namespace model
{

enum PYNameID {
  ID,
  REF_ID,
  PARENT,
  PARENT_FUNCTION
};

#pragma db object
struct PYName
{
  #pragma db id unique
  std::uint64_t id = 0;

  #pragma db index
  std::uint64_t ref_id;

  std::uint64_t parent;
  std::uint64_t parent_function;

  bool is_definition = false;
  bool is_builtin = false;
  bool is_import = false;
  bool is_call = false;
  std::string full_name;
  std::string value;
  std::string type;
  std::string type_hint;

  std::uint64_t line_start;
  std::uint64_t line_end;
  std::uint64_t column_start;
  std::uint64_t column_end;

  #pragma db index
  std::uint64_t file_id;
};

}
}

#endif
1 change: 1 addition & 0 deletions plugins/python/parser/.gitignore
@@ -0,0 +1 @@
venv/
46 changes: 46 additions & 0 deletions plugins/python/parser/CMakeLists.txt
@@ -0,0 +1,46 @@
find_package(Python3 REQUIRED COMPONENTS Interpreter Development)
find_package(Boost REQUIRED COMPONENTS python)

include_directories(
  include
  ${PROJECT_SOURCE_DIR}/model/include
  ${PROJECT_SOURCE_DIR}/util/include
  ${PROJECT_SOURCE_DIR}/parser/include
  ${PLUGIN_DIR}/model/include)

include_directories(SYSTEM
  ${Boost_INCLUDE_DIRS}
  ${Python3_INCLUDE_DIRS})

add_library(pythonparser SHARED
  src/pythonparser.cpp)

target_link_libraries(pythonparser
  model
  pythonmodel
  ${Boost_LIBRARIES}
  ${Python3_LIBRARIES})

target_compile_options(pythonparser PUBLIC -Wno-unknown-pragmas)

set(VENV_DIR "${PLUGIN_DIR}/parser/venv/")
if(NOT EXISTS ${VENV_DIR})
  message("Creating Python virtual environment: ${VENV_DIR}")
  execute_process(
    COMMAND python3 -m venv venv
    WORKING_DIRECTORY ${PLUGIN_DIR}/parser/)
endif()

message("Installing Python dependencies...")
execute_process(
  COMMAND venv/bin/pip install -r requirements.txt
  WORKING_DIRECTORY ${PLUGIN_DIR}/parser/)

install(TARGETS pythonparser DESTINATION ${INSTALL_PARSER_DIR})
install(
  DIRECTORY pyparser/
  DESTINATION ${INSTALL_PYTHON_DIR}/pyparser)
install(
  DIRECTORY venv/
  DESTINATION ${INSTALL_PYTHON_DIR}/venv)

43 changes: 43 additions & 0 deletions plugins/python/parser/include/pythonparser/pythonparser.h
@@ -0,0 +1,43 @@
#ifndef CC_PARSER_PYTHONPARSER_H
#define CC_PARSER_PYTHONPARSER_H

#include <cstdint>
#include <string>
#include <vector>
#include <map>
#include <unordered_map>
#include <parser/abstractparser.h>
#include <parser/parsercontext.h>
#include <parser/sourcemanager.h>
#include <util/parserutil.h>
#include <boost/python.hpp>
#include <model/pyname.h>
#include <model/pyname-odb.hxx>
namespace cc
{
namespace parser
{

namespace python = boost::python;

typedef std::unordered_map<std::uint64_t, model::PYName> PYNameMap;

class PythonParser : public AbstractParser
{
public:
  PythonParser(ParserContext& ctx_);
  virtual ~PythonParser();
  virtual bool parse() override;
private:
  struct ParseResultStats {
    std::uint32_t partial;
    std::uint32_t full;
  };

  python::object m_py_module;
  void processFile(const python::object& obj, PYNameMap& map, ParseResultStats& parse_result);
  void parseProject(const std::string& root_path);
};

} // parser
} // cc

#endif // CC_PARSER_PYTHONPARSER_H