What is Plume?
Plume is a library that converts JVM bytecode into a code property graph (CPG) and persists it in a graph database storage backend. Given an application compiled to JVM bytecode, the library analyzes the bytecode using an interprocedural control-flow graph (ICFG) produced with Soot. This ICFG is then converted into a code property graph which, at its base, is a combination of an abstract syntax tree (AST), a control flow graph (CFG), and a program dependence graph (PDG). This three-part graph is then persisted in a supported graph database, where it can be queried and analyzed with tools such as Joern.
Note that Plume produces CPGs best suited to dataflow tracking. Because Jimple is used as the IR, the resulting ASTs are fairly degenerate, which makes Plume unsuitable for AST analysis.
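To make the "three-part graph" idea concrete, here is a minimal sketch of a toy CPG for a three-statement method: one shared set of nodes with separate AST, CFG, and PDG edge layers. The class `ToyCpg`, its methods, and the node labels are hypothetical and exist only for illustration; they are not Plume's API and do not follow the real CPG schema.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Toy code property graph: one node set, three edge layers. Not Plume's schema. */
final class ToyCpg {
    enum EdgeKind { AST, CFG, PDG }

    static final class Edge {
        final int from;
        final int to;
        Edge(int from, int to) { this.from = from; this.to = to; }
    }

    private final List<String> nodes = new ArrayList<>();
    private final Map<EdgeKind, List<Edge>> edges = new EnumMap<>(EdgeKind.class);

    /** Adds a node and returns its index. */
    int addNode(String label) {
        nodes.add(label);
        return nodes.size() - 1;
    }

    void addEdge(EdgeKind kind, int from, int to) {
        edges.computeIfAbsent(kind, k -> new ArrayList<>()).add(new Edge(from, to));
    }

    /** Labels of nodes one step away from `from` along edges of the given kind. */
    List<String> successors(EdgeKind kind, int from) {
        List<String> out = new ArrayList<>();
        for (Edge e : edges.getOrDefault(kind, List.of())) {
            if (e.from == from) out.add(nodes.get(e.to));
        }
        return out;
    }

    /** Builds the graph for: int f(int x) { int y = x + 1; return y; } */
    static ToyCpg example() {
        ToyCpg g = new ToyCpg();
        int method = g.addNode("METHOD f");
        int assign = g.addNode("y = x + 1");
        int ret = g.addNode("return y");
        g.addEdge(EdgeKind.AST, method, assign); // syntax: both statements are children of f
        g.addEdge(EdgeKind.AST, method, ret);
        g.addEdge(EdgeKind.CFG, method, assign); // control flow: entry, then assign, then return
        g.addEdge(EdgeKind.CFG, assign, ret);
        g.addEdge(EdgeKind.PDG, assign, ret);    // data dependence: return reads y
        return g;
    }

    public static void main(String[] args) {
        ToyCpg g = example();
        for (EdgeKind kind : EdgeKind.values()) {
            System.out.println(kind + " from 'y = x + 1': " + g.successors(kind, 1));
        }
    }
}
```

Because all three layers share the same nodes, a query can move between them, for example following a PDG edge to find where a value is used and then CFG edges to see what executes next.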
Benefits of using Plume
Plume is an open-source Kotlin project which provides a type-safe interface for interacting with a graph database constructed using the CPG schema from ShiftLeft, the primary maintainers of Fabian Yamaguchi's Joern. Since Kotlin is interoperable with Java, incorporating Plume into Java projects should pose no difficulty.
Storing the CPG in a graph database is motivated by three observations. Analysis can be done incrementally: one does not have to regenerate the graph every time one wishes to perform analysis, and results can be persisted. Updates can be partial: if one method changes, only that subtree is regenerated. And the approach scales to large applications, given how the number of nodes and edges in a CPG grows. Plume supports multiple graph databases so that developers can select one based on their software stack and processing requirements.
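One way to picture the partial-update idea is to keep a hash per method alongside the stored graph and regenerate only the methods whose hash has changed. The sketch below assumes this hashing scheme purely for illustration; the class `IncrementalSketch`, its method names, and the way method bodies are supplied as byte arrays are all hypothetical and are not taken from Plume's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical change detector: remembers one hash per method between runs. */
final class IncrementalSketch {
    private final Map<String, String> storedHashes = new HashMap<>();

    static String sha256(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Records the current hash of every method and returns the names of
     * methods that are new or changed, i.e. whose subgraph needs rebuilding.
     * Deleted methods are not handled in this sketch.
     */
    Set<String> update(Map<String, byte[]> methodBodies) {
        Set<String> stale = new TreeSet<>();
        for (Map.Entry<String, byte[]> entry : methodBodies.entrySet()) {
            String hash = sha256(entry.getValue());
            String previous = storedHashes.put(entry.getKey(), hash);
            if (!hash.equals(previous)) stale.add(entry.getKey());
        }
        return stale;
    }

    public static void main(String[] args) {
        IncrementalSketch detector = new IncrementalSketch();
        Map<String, byte[]> program = new HashMap<>();
        program.put("A.foo", "return 1".getBytes(StandardCharsets.UTF_8));
        program.put("A.bar", "return 2".getBytes(StandardCharsets.UTF_8));
        System.out.println("first run:  " + detector.update(program));

        program.put("A.foo", "return 3".getBytes(StandardCharsets.UTF_8));
        System.out.println("second run: " + detector.update(program));
    }
}
```

On the first run every method is reported because nothing has been stored yet; on later runs only methods whose bytes differ are reported, which is what lets the rest of the persisted graph be reused.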
Plume provides JVM bytecode support for the Joern project, which means that one can use Joern to perform analysis on Plume-generated CPGs.
Since Plume analyzes JVM bytecode, if a language is able to compile to JVM bytecode then Plume can accept it. Plume supports compiling Java source code automatically.
Simply load the respective Java source or .class files (or a directory or JAR file containing either) using the extractor.
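As a rough illustration of what accepting a file, a directory, or a JAR involves, the sketch below enumerates the .class entries found in each kind of input using only the Java standard library. It is not Plume's extractor and makes no claim about how the extractor works internally; the class `ClassFileCollector` and its `collect` method are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists .class entries in a file, directory, or JAR. */
final class ClassFileCollector {

    static List<String> collect(Path input) throws IOException {
        if (Files.isDirectory(input)) {
            // Walk the tree and report paths relative to the directory root.
            try (Stream<Path> paths = Files.walk(input)) {
                return paths.filter(Files::isRegularFile)
                        .map(p -> input.relativize(p).toString().replace('\\', '/'))
                        .filter(name -> name.endsWith(".class"))
                        .sorted()
                        .collect(Collectors.toList());
            }
        }
        if (input.toString().endsWith(".jar")) {
            // A JAR is a ZIP archive; its entry names already use '/' separators.
            try (JarFile jar = new JarFile(input.toFile())) {
                return jar.stream()
                        .map(JarEntry::getName)
                        .filter(name -> name.endsWith(".class"))
                        .sorted()
                        .collect(Collectors.toList());
            }
        }
        // A single file: accept it only if it is a .class file.
        return Files.isRegularFile(input) && input.toString().endsWith(".class")
                ? List.of(input.getFileName().toString())
                : List.of();
    }

    public static void main(String[] args) throws IOException {
        Path input = Path.of(args.length > 0 ? args[0] : ".");
        collect(input).forEach(System.out::println);
    }
}
```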
General Plume Benefits
- Choice of graph database: in-memory to dedicated, open-source to enterprise, single node to multi-machine clustered.
- CPG schema strictly enforced in the Plume driver using codepropertygraph domain classes.
- Handles very large code property graphs.
- Analysis can be done incrementally.
- Simple-to-use interface.
- Open source under the liberal Apache 2 license.
- Able to project JVM bytecode to CPG
- Store CPG in various graph databases with database agnostic interface
- Run dataflowengineoss passes over Plume CPGs
- Implement change detection and incremental analysis
- Benchmark and prove most efficient/fastest/cheapest storage backend
- Perform overhaul around best storage backend for optimized performance
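The database agnostic interface mentioned in the list above can be pictured as a small contract that every storage backend implements, so that analysis code never depends on a particular database. The following is a minimal sketch under that assumption; `GraphDriver`, `InMemoryDriver`, and `DriverDemo` are hypothetical names, and Plume's real driver interface may look quite different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical storage-agnostic contract; Plume's real driver API may differ. */
interface GraphDriver {
    /** Stores a vertex and returns its generated id. */
    long addVertex(String label);

    /** Stores a directed, labelled edge between two existing vertices. */
    void addEdge(long from, long to, String label);

    /** Ids reachable from `from` over one edge carrying `label`. */
    List<Long> successors(long from, String label);
}

/** In-memory backend; a remote database backend would implement the same interface. */
final class InMemoryDriver implements GraphDriver {
    private final List<String> vertexLabels = new ArrayList<>();
    private final Map<String, List<Long>> adjacency = new HashMap<>();

    private static String key(long from, String label) {
        return from + "|" + label;
    }

    @Override
    public long addVertex(String label) {
        vertexLabels.add(label);
        return vertexLabels.size() - 1;
    }

    @Override
    public void addEdge(long from, long to, String label) {
        int count = vertexLabels.size();
        if (from < 0 || to < 0 || from >= count || to >= count) {
            throw new IllegalArgumentException("unknown vertex id");
        }
        adjacency.computeIfAbsent(key(from, label), k -> new ArrayList<>()).add(to);
    }

    @Override
    public List<Long> successors(long from, String label) {
        // Return a copy so callers cannot mutate the stored adjacency list.
        return new ArrayList<>(adjacency.getOrDefault(key(from, label), List.of()));
    }
}

final class DriverDemo {
    /** Analysis code depends only on GraphDriver, so backends are interchangeable. */
    static int countCfgSuccessors(GraphDriver driver, long vertex) {
        return driver.successors(vertex, "CFG").size();
    }

    public static void main(String[] args) {
        GraphDriver driver = new InMemoryDriver();
        long call = driver.addVertex("CALL");
        long ret = driver.addVertex("RETURN");
        driver.addEdge(call, ret, "CFG");
        System.out.println(countCfgSuccessors(driver, call));
    }
}
```

Keeping the contract this narrow is what would make benchmarking different backends practical: the same analysis can be run unchanged against each implementation.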
Where does the name come from?
The word "plume" can describe a large column of smoke, dust, or fire rising into the air. Because Plume leverages Soot to construct the code property graph, a colleague of mine, Lauren Hayward, suggested it be called Plume as an homage, and so it was named.