Atlas is a state-of-the-art platform for building fast, highly scalable solutions to software analysis problems. It has been used to analyze large legacy software systems running to tens of millions of lines of code. Application areas include automatic porting, binary analysis, malware detection, and minimal patch generation. ENSOFT has used Atlas to build custom solutions for the US Department of Defense as well as for commercial clients.
Atlas performs software analysis over a graph representation of the code. Below, we demonstrate how to check a memory safety property using Atlas.
The code shown to the right is from XINU, an embedded operating system. In the function dswrite, memory is allocated on line 21 using the function getbuf, but a corresponding deallocation, using XINU's freebuf function, is missing. Allocated memory that is never deallocated is a memory safety violation known as a memory leak. Is there a memory leak in XINU?
Static analysis tools perform inter-procedural analysis to compute the call chains shown on screen. They find that the memory is deallocated in the function dskqopt on some execution paths, but not on all of them. The tools therefore conclude that there is a memory leak in XINU and recommend deallocating the memory at the end of dswrite. This fix is problematic because it violates the XINU design, which handles deallocation using interrupts. A correct analysis should also capture the function that is invoked by an interrupt and deallocates the memory. This is a valid implementation of the producer-consumer pattern, and there is in fact no memory leak. Let’s dive in and find this missing function using Atlas.
Analysis in Atlas is done by interacting with the graph representation. Atlas provides a query language to facilitate the interaction and an execution environment to execute the queries. This environment is called the Atlas Shell. You can open the Atlas Shell by clicking Atlas -> Open Atlas Shell. Let’s warm up by locating the code we are interested in: the function dswrite.
The query for locating dswrite is (functions query). Hit enter to execute the query. Using the show query, we can see the result of the executed query: a node in the graph that represents the function dswrite. Atlas maintains source correspondence with the graph, so double-clicking a graph element navigates to the code segment it represents.
The function dswrite allocates memory of the type dreq. We will use this information to discover the missing function needed to complete the analysis.
Let’s start by creating some handy variables to represent the allocation function getbuf, the deallocation function freebuf, and the type dreq.
First, compute all the functions that can potentially alias the memory allocated in dswrite. Let’s see the result of the query. It captures dswrite, as expected, but it also captures dsinter, the function invoked by an interrupt. We are interested in only a subset of these functions, since not all of them can deallocate the memory.
The query to compute the relevant functions and the call chains is a combination of multiple queries. The first query computes the relevant call chains originating at dswrite. The second query computes the relevant call chains invoked by an interrupt. Finally, we combine the two results.
Here are the call chains computed by the query that are necessary to complete the analysis. They show that dswrite allocates memory, which is passed through dskenq to dskqopt and is deallocated on some execution paths in dskqopt. The other mechanism for deallocation is via dsinter, which is invoked by an interrupt. It accesses the memory allocated in dswrite through a global alias and deallocates it. The verification can then be completed using a model checker to verify all execution paths captured in these call chains.
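The way the two results are combined can be pictured with a toy model. The sketch below is plain Python, not the Atlas query language, and the call edges are a hypothetical reading of this walkthrough rather than data extracted from XINU. Only the function names dswrite, dskenq, dskqopt, dsinter, getbuf, and freebuf come from the text; other_caller and every helper name are made up for illustration.

```python
from collections import deque

# Toy call graph: caller -> callees. Hypothetical edges based on the
# walkthrough above, not output produced by Atlas.
CALLS = {
    "dswrite": ["getbuf", "dskenq"],
    "dskenq": ["dskqopt"],
    "dskqopt": ["freebuf"],
    "dsinter": ["freebuf"],
    "other_caller": ["getbuf"],  # hypothetical unrelated caller (noise)
}


def reverse_graph(graph):
    """Flip every edge so that callees point back to their callers."""
    rev = {}
    for caller, callees in graph.items():
        for callee in callees:
            rev.setdefault(callee, []).append(caller)
    return rev


def reachable(graph, start):
    """Return every node reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def chains_between(graph, source, target):
    """Nodes that lie on some call path from source to target."""
    return reachable(graph, source) & reachable(reverse_graph(graph), target)


# First query: call chains that start at dswrite and reach the deallocator.
from_dswrite = chains_between(CALLS, "dswrite", "freebuf")
# Second query: call chains that start at the interrupt handler.
from_interrupt = chains_between(CALLS, "dsinter", "freebuf")
# Finally, combine the two results.
relevant = from_dswrite | from_interrupt
print(sorted(relevant))
```

In this toy graph the combined result keeps dswrite, dskenq, dskqopt, dsinter, and freebuf, and drops getbuf and other_caller because neither lies on a path to the deallocator. A real analysis would also have to account for data flow and aliasing, which this call-only model ignores.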
Big Data for Code
Atlas is a scalable platform for mining and analyzing semantically rich graphs from software. It’s built on two decades of research into solving complex software engineering problems using graphs. Atlas can be used interactively to explore a large dataset derived from software or programmatically to perform sophisticated automatic analysis.
Track record of success:
Sophisticated malware detection
Automated auditing of safety critical software
Application Modernization
Complete Solution
ENSOFT provides Atlas, as well as engineering services, to help you build a complete solution for mining relevant information. For example, ENSOFT helped a financial services company identify key integration points for an application modernization project.
ENSOFT’s expertise extends from building a dataset from your code and setting up a computing environment (big or small) to developing novel graph algorithms to solve your specific problem. For example, ENSOFT partnered with Iowa State University to develop a complete malware detection solution for the Defense Advanced Research Projects Agency.
Tailored for Software-derived Datasets
Atlas uses a fast graph database and query engine tailored for a software-derived dataset. In contrast to other big data queries that find patterns within K-degree neighborhoods (e.g. “restaurants my friends’ friends gave 4 or more stars” requires a 3-degree query), software queries can span arbitrary degrees. For example, matching an input variable to an output variable in an embedded controller can easily span nodes that are 100 degrees apart.
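The difference between a bounded-neighborhood query and an arbitrary-degree query can be sketched in a few lines. This is an illustration in plain Python, not Atlas's query engine: the function within_k and the 100-edge data-flow chain are hypothetical, built only to mirror the embedded-controller example above.

```python
from collections import deque


def within_k(graph, start, k=None):
    """Nodes reachable from start in at most k edges (unbounded if k is None)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if k is not None and depth[node] == k:
            continue  # do not expand past the k-degree boundary
        for nxt in graph.get(node, []):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return set(depth)


# Hypothetical data-flow chain: input -> v1 -> ... -> v99 -> output (100 edges).
names = ["input"] + [f"v{i}" for i in range(1, 100)] + ["output"]
flow = {a: [b] for a, b in zip(names, names[1:])}

# A 3-degree query stops long before the output variable.
print("output" in within_k(flow, "input", k=3))
# An unbounded traversal follows the chain all the way to the end.
print("output" in within_k(flow, "input"))
```

The bounded query visits only the input and its first three successors, while the unbounded one reaches the output variable 100 edges away, which is the kind of traversal software queries tend to need.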
Scalable
Atlas can handle huge datasets in a commodity cluster, but even on a modern desktop you can work with millions of lines of code.
Multilingual
Atlas was built to work with many programming languages. We provide flagship support for Java and C/C++, as well as support for COBOL, Ada, and other industry-specific languages. ENSOFT can quickly build support for additional languages to meet your project needs.
Try Atlas Professional
Downloading Atlas
You can download Atlas for use on a workstation (i.e. desktop or laptop). The download is listed at the end of this page. For use on a cluster or cloud deployment, please contact us.
You will need a license key to run Atlas. If you don’t already have a key, please request a free Atlas license.
Getting Started with Atlas
To get started, follow the steps below:
· Click the Atlas -> Atlas Smart View menu item.
· Atlas will map your source code; this takes a few seconds to a few minutes for most projects.
· Then click on a field to get a graph.
To change the type of graph, click the small down-arrow menu in the Atlas Smart View and pick a different script. Have fun trying different scripts and clicking in different parts of your code.
Downloading Atlas
Atlas for C
Windows 64-bit
System Requirements for Atlas
Workstation Requirements
Software
Atlas supports Windows 64-bit and Ubuntu 64-bit 18.04 or higher.
Hardware
Atlas will run on most developer workstations purchased in the last five years. However, specific hardware needs will vary depending on the size and complexity of your code and your application area (e.g. understanding code, quality assurance, project estimation, etc.); please consult the tables below.
Cluster Requirements
Before assuming that a cluster is necessary, please consider if a workstation will suffice. A high-end workstation can readily handle 4 million lines of code.
If your project involves more than 4 million lines of code, the specific requirements for deploying Atlas in a cluster will depend on your project needs. Please contact us for a consultation.
General Software Requirements
In general, Atlas can be deployed on any modern 64-bit Linux (or other UNIX flavor) cluster.
General Hardware Requirements
Atlas relies on large, in-memory caches to execute queries quickly. Your cluster should have 12GB of RAM and 12GB of persistent storage per million lines of code. We suggest you use SSDs for persistent storage.
Typically, Atlas is not sensitive to network latency, so a commodity gigabit Ethernet network is sufficient for most applications. We recommend modern 64-bit processors; however, the specific number of cores is largely determined by your project requirements.
Component | Minimum | Typical | Advanced |
---|---|---|---|
Processor | Core 2 Duo | Core i5 1.8GHz+ | Core i7 2.6GHz+ |
RAM | 4GB | 8GB | 16GB+ |
Storage | HDD | SSD or Hybrid Drive | SSD |

Memory Guidelines

Lines of Code | Code Map Size* | Recommended -Xmx |
---|---|---|
250KLOC | 2GB | -Xmx3072m |
500KLOC | 4GB | -Xmx6144m |
1MLOC | 8GB | -Xmx12288m |
2MLOC | 16GB | -Xmx24576m |
4MLOC | 32GB | -Xmx49152m |

*Actual code map size will vary based on style and complexity of your code.
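The rows of the memory guidelines table appear to follow a simple linear pattern: roughly 8GB of code map per million lines of code, with -Xmx set to 1.5 times the code map size. That pattern is an observation drawn from the table, not a documented rule, and the function names below are hypothetical. Under those assumptions, a rough estimate for sizes between the listed rows could be sketched as:

```python
def code_map_gb(lines_of_code):
    """Estimated code map size in GB, assuming 8 GB per million lines."""
    return lines_of_code * 8 / 1_000_000


def recommended_xmx(lines_of_code):
    """JVM heap flag, assuming the heap is 1.5 times the code map size."""
    heap_mb = int(code_map_gb(lines_of_code) * 1.5 * 1024)
    return f"-Xmx{heap_mb}m"


# Reproduce the rows of the table from the assumed pattern.
for loc in (250_000, 500_000, 1_000_000, 2_000_000, 4_000_000):
    print(loc, code_map_gb(loc), recommended_xmx(loc))
```

Because actual code map size varies with the style and complexity of the code, treat the result as a starting point and adjust it after observing real memory use.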