Introduction
What is Plume?
Plume is a library that converts JVM bytecode into a code property graph (CPG) and persists it in a graph database backend. Given an application compiled to JVM bytecode, the library analyzes the bytecode using an interprocedural control-flow graph (ICFG) produced by Soot. This ICFG is then converted into a code property graph which, at its base level, is a combination of an abstract syntax tree (AST), a control flow graph (CFG), and a program dependence graph (PDG). This three-part graph is then persisted in a supported graph database, where it can be queried and analyzed with tools such as Joern.
Note that the CPGs Plume produces are best suited to dataflow tracking. Using Jimple as the IR yields fairly degenerate ASTs, which makes Plume unsuitable for AST analysis.
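To make the idea of one graph carrying AST, CFG, and PDG information at once more concrete, here is a minimal Scala sketch. All of the types in it (`Node`, `Edge`, `EdgeKind`, `Cpg`) are hypothetical and exist only for illustration; they are not the codepropertygraph schema classes, and a real CPG has many more node and edge types.

```scala
// Illustrative model only: these types are NOT Plume's or Joern's schema classes.
sealed trait EdgeKind
case object Ast extends EdgeKind // syntactic parent -> child
case object Cfg extends EdgeKind // possible execution order
case object Pdg extends EdgeKind // data or control dependence

final case class Node(id: Int, label: String, code: String)
final case class Edge(src: Int, dst: Int, kind: EdgeKind)

final case class Cpg(nodes: Map[Int, Node], edges: List[Edge]) {
  // Follow only edges of one kind, i.e. view a single layer of the combined graph.
  def successors(id: Int, kind: EdgeKind): List[Node] =
    edges.collect { case Edge(`id`, dst, `kind`) => nodes(dst) }
}

object CpgSketch {
  // Tiny method body: `x = source(); sink(x);`
  val nodes: Map[Int, Node] = List(
    Node(1, "METHOD", "void f()"),
    Node(2, "CALL", "x = source()"),
    Node(3, "CALL", "sink(x)")
  ).map(n => n.id -> n).toMap

  val cpg: Cpg = Cpg(nodes, List(
    Edge(1, 2, Ast), Edge(1, 3, Ast), // both statements are children of the method
    Edge(1, 2, Cfg), Edge(2, 3, Cfg), // entry -> first statement -> second statement
    Edge(2, 3, Pdg)                   // `sink(x)` uses the value defined by `x = source()`
  ))

  def main(args: Array[String]): Unit = {
    println(cpg.successors(1, Ast).map(_.code)) // AST children of the method
    println(cpg.successors(2, Pdg).map(_.code)) // statements depending on node 2
  }
}
```

Because all three layers share the same nodes, a dataflow query can follow `Pdg` edges and then use `Ast` or `Cfg` edges from the same node to recover context, without joining separate graphs.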
Optimized in Joern as jimple2cpg
Plume is the original implementation of jimple2cpg. The frontend in the Joern project is optimized around OverflowDB and is much more lightweight. This project focuses on experimenting with incremental dataflow analysis and comparing database backend performance.
Benefits of using Plume
Plume is an open-source Scala project that provides a type-safe interface for interacting with a graph database constructed using the CPG schema from ShiftLeft, the primary maintainers of Fabian Yamaguchi's Joern.
Storing the CPG in a graph database is motivated by three observations: analysis can be done incrementally (the graph does not have to be regenerated every time one wishes to perform analysis, and results can be persisted); updates can be partial (if one method changes, only that subtree is regenerated); and the approach scales to large applications because of how the number of nodes and edges grows for the CPG. Plume supports multiple graph databases so that developers can select one based on their software stack and processing requirements.
Plume provides JVM bytecode support for the Joern project, which means that one can use Joern to perform analysis on Plume-generated CPGs.
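The partial-update idea can be sketched as a comparison between fingerprints stored alongside each method's subgraph and fingerprints of the freshly compiled methods. This is only one possible way to detect changes, not a description of Plume's actual mechanism; `IncrementalSketch`, `fingerprint`, and `diff` are hypothetical names introduced for illustration.

```scala
import java.security.MessageDigest

// Hypothetical sketch: the names below are illustrative and are NOT Plume's API.
object IncrementalSketch {
  // Fingerprint a method body so that a change can be detected cheaply.
  def fingerprint(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256").digest(bytes).map("%02x".format(_)).mkString

  // `stored` maps a method's full name to the fingerprint saved with its subgraph.
  // `current` maps each method in the new build to its body.
  // Returns (methods to build or rebuild, methods whose subgraph should be deleted).
  def diff(
      stored: Map[String, String],
      current: Map[String, Array[Byte]]
  ): (Set[String], Set[String]) = {
    val changedOrNew = current.collect {
      case (name, body) if !stored.get(name).contains(fingerprint(body)) => name
    }.toSet
    val removed = stored.keySet -- current.keySet
    (changedOrNew, removed)
  }

  def main(args: Array[String]): Unit = {
    val v1 = "v1".getBytes("UTF-8")
    val v2 = "v2".getBytes("UTF-8")
    val stored = Map(
      "A.foo" -> fingerprint(v1),
      "A.bar" -> fingerprint(v1),
      "A.old" -> fingerprint(v1)
    )
    val current = Map("A.foo" -> v1, "A.bar" -> v2, "A.baz" -> v1)
    val (rebuild, delete) = diff(stored, current)
    println(s"rebuild: $rebuild, delete: $delete")
  }
}
```

In this example only `A.bar` (changed) and `A.baz` (new) would be regenerated and `A.old` would be dropped, while the subgraph for the unchanged `A.foo` stays in the database as it is.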
Supported Languages
Since Plume analyzes JVM bytecode, it can accept any language that compiles to JVM bytecode. Plume no longer supports compiling Java source code to bytecode automatically.
Simply load the respective .jar or .class files (or a directory or JAR file containing either) using the Jimple2Cpg::createCpg method accordingly.
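Before pointing Plume at a directory, it can be useful to preview which files under it would qualify as input. The sketch below does this with the standard `java.nio.file` API only. `InputFiles` and `collect` are hypothetical helper names, the code assumes Scala 2.13 or later (for `scala.jdk.CollectionConverters`), and it deliberately does not call `Jimple2Cpg::createCpg`, whose signature is not given here.

```scala
import java.nio.file.{Files, Path, Paths}
import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of Plume: lists candidate bytecode inputs.
object InputFiles {
  // Recursively collect every regular file ending in .class or .jar under `root`.
  // If `root` is itself such a file, the result contains just that file.
  def collect(root: Path): List[Path] = {
    val stream = Files.walk(root)
    try {
      stream.iterator().asScala
        .filter(p => Files.isRegularFile(p))
        .filter { p =>
          val name = p.getFileName.toString
          name.endsWith(".class") || name.endsWith(".jar")
        }
        .toList
    } finally stream.close() // Files.walk holds directory handles until closed
  }

  def main(args: Array[String]): Unit = {
    val root = Paths.get(args.headOption.getOrElse("."))
    collect(root).foreach(println)
  }
}
```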
General Plume Benefits
- Choice of graph database: in-memory to dedicated, open-source to enterprise, single node to multi-machine clustered.
- CPG schema strictly enforced in the Plume driver using codepropertygraph domain classes.
- Handles very large code property graphs.
- Analysis can be done incrementally.
- Simple-to-use interface.
- Open source under the liberal Apache 2 license.
Where does the name come from?
The word "plume" can describe a large column of smoke, dust, or fire rising into the air. Because Plume leverages Soot to construct the code property graph, a colleague of mine, Lauren Hayward, suggested it be called Plume as an homage, and so it was named.