What is Soufflé?
Datalog is a logic-based query language that makes it easy to write recursive queries. Datalog has no unified syntax specification, so each implementation may differ. Soufflé [1] is an open-source “state of the art” Datalog engine written in C++ that uses OpenMP for parallelization. It provides a compiler that translates Datalog programs into efficient C++ code, enabling Soufflé to handle large datasets while maintaining fast execution speeds. Soufflé also offers an interpreter for quick testing and debugging of Datalog programs. In addition to its core functionality, Soufflé offers several advanced features [2], including:
- Semi-naïve evaluation: Soufflé uses a semi-naïve evaluation strategy to efficiently compute the least fixed point of a set of recursive rules. Naïve evaluation re-applies every rule to all facts known so far in each iteration, so it derives the same facts again and again. Semi-naïve evaluation instead keeps track of the facts newly derived in each iteration and, in the next iteration, only evaluates rule instances that use at least one of those new facts. This avoids redundant computation, because combinations of facts that were already processed in earlier iterations are not re-evaluated.
- Stratified negation: A rule may negate a relation as long as the negation does not occur inside a recursive cycle. This allows for reasoning about the absence of facts in the program.
- Aggregation: Soufflé supports various aggregation operators, enabling computations like counting, summing, and averaging data.
- Automatic index selection: Soufflé automatically selects appropriate indexes for relations, improving query performance by efficiently accessing relevant data.
- Specialized parallel data structures: Soufflé utilizes optimized data structures like disjoint-sets, B-trees, and tries to handle specific data relationships efficiently in a parallel setting, further enhancing performance.
- Static typing: Soufflé checks data types at compile time, so type errors are caught before the program runs. The type information also allows for better code optimization.
- Records and algebraic data types: Soufflé supports defining complex data structures using records and algebraic data types, enabling the representation of hierarchical or variant data within Datalog programs.
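To make the difference between naïve and semi-naïve evaluation concrete, here is a minimal Python sketch of both strategies for a recursive ancestor relation. It illustrates the general technique only and is not how Soufflé is implemented internally. The names `PARENT`, `naive`, and `semi_naive`, and the `joins` counter, are made up for this example.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]

# Hypothetical facts: parent(x, y) means x is a parent of y.
PARENT: Set[Pair] = {("john", "bob"), ("bob", "alice"), ("alice", "charlie")}


def naive(parent: Set[Pair]) -> Tuple[Set[Pair], int]:
    """Re-apply both rules to ALL known facts until nothing new appears."""
    ancestor: Set[Pair] = set()
    joins = 0  # candidate (parent, ancestor) pairs examined by the recursive rule
    while True:
        derived = set(parent)  # ancestor(X, Y) :- parent(X, Y).
        for (x, y) in parent:  # ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
            for (y2, z) in ancestor:
                joins += 1
                if y == y2:
                    derived.add((x, z))
        if derived <= ancestor:  # fixed point reached
            return ancestor, joins
        ancestor |= derived


def semi_naive(parent: Set[Pair]) -> Tuple[Set[Pair], int]:
    """Join the recursive rule only against facts that are new (the delta)."""
    ancestor = set(parent)  # the base rule is evaluated once
    delta = set(parent)     # facts derived in the previous iteration
    joins = 0
    while delta:
        new: Set[Pair] = set()
        for (x, y) in parent:
            for (y2, z) in delta:  # old ancestor facts are not revisited
                joins += 1
                if y == y2 and (x, z) not in ancestor:
                    new.add((x, z))
        ancestor |= new
        delta = new
    return ancestor, joins


if __name__ == "__main__":
    result_naive, joins_naive = naive(PARENT)
    result_semi, joins_semi = semi_naive(PARENT)
    print(sorted(result_semi))
    print("same result:", result_naive == result_semi)
    print("joins (naive vs semi-naive):", joins_naive, joins_semi)
```

Both functions compute the same set of ancestor pairs, but the semi-naïve version examines fewer candidate pairs because each iteration only looks at the delta from the previous one.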
Run Soufflé on Docker interactive mode
Soufflé can be installed system-wide on popular Linux and Unix-based operating systems [3]. From the official documentation, it is unclear whether Soufflé can be used on Windows. Recently, when I tried to install Soufflé on Ubuntu 23.10, I faced the following error:
The following packages have unmet dependencies:
souffle : Depends: libffi7 (>= 3.3~20180313) but it is not installable
E: Unable to correct problems, you have held broken packages.
This inspired me to try running Soufflé on Docker. Follow the previous post to Install Docker on Ubuntu. For Windows users or those facing installation issues on specific Linux distributions, running Soufflé in a Docker container provides a convenient and consistent environment.
Steps to install Soufflé on Ubuntu using Docker
In this tutorial, we will see how to use Soufflé on Docker in interactive mode.
Step 1. Create a Dockerfile
First, we need to create a Dockerfile that sets up the necessary environment for Soufflé. Save the following content in a file named Dockerfile:
FROM ubuntu:20.04
# Update package lists and install the tools needed to add the Souffle repository
RUN apt-get update && \
    apt-get install -y \
        wget \
        gnupg \
    && rm -rf /var/lib/apt/lists/*
# Download and add the Souffle repository key
RUN wget -q https://souffle-lang.github.io/ppa/souffle-key.public -O /usr/share/keyrings/souffle-archive-keyring.gpg
# Add the Souffle repository to the sources list
RUN echo "deb [signed-by=/usr/share/keyrings/souffle-archive-keyring.gpg] https://souffle-lang.github.io/ppa/ubuntu/ stable main" | tee /etc/apt/sources.list.d/souffle.list
# Update package lists again and install Souffle
RUN apt-get update && \
    apt-get install -y \
        souffle
# Start a bash shell by default when the container runs
CMD ["/bin/bash"]
Step 2. Create a Sample Datalog Program
In the same directory as the Dockerfile, let’s create a file named demo.dl with the following Datalog program:
.decl parent(n: symbol, m: symbol)
.decl ancestor(n: symbol, m: symbol)
.output ancestor
// Facts of parent: Extensional database
parent("john", "bob").
parent("bob", "alice").
parent("alice", "charlie").
// Base rule
ancestor(X, Y) :- parent(X, Y).
// Inductive rule
ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
Step 3. Build the Docker Image
Open a terminal and navigate to the directory containing the Dockerfile and demo.dl files. Build the Docker image by running the following command:
docker build -t 'souffle-image' .
Step 4. Run the Docker Container
After the image is built, run the Docker container in interactive mode with the current directory mounted as a volume:
docker run -v "$(pwd)":/workspace -it --name=souffle-container souffle-image
Step 5. Run Datalog program in interpreter mode
We can execute Datalog programs using Soufflé within the Docker container. Navigate to the /workspace directory and run the demo.dl program in interpreter mode. The -D- option instructs Soufflé to display the output relation on the standard output. You should see the ancestor relation printed to the console.
root@5b7f16a197c8:/# cd workspace/
root@5b7f16a197c8:/workspace# souffle -D- demo.dl
---------------
ancestor
n m
===============
john bob
john alice
john charlie
bob alice
bob charlie
alice charlie
===============
Step 6. Run Datalog program in compiler mode
Alternatively, we can run the program in compiler mode using the -o option of Soufflé, which translates the Datalog program to C++ code and also creates an executable.
The following command generates an executable named demo that can be run to print the ancestor relation on standard output.
root@5b7f16a197c8:/workspace# souffle -D- -o demo demo.dl
root@5b7f16a197c8:/workspace# ls
Dockerfile demo demo.cpp demo.dl
root@5b7f16a197c8:/workspace# ./demo -D-
---------------
ancestor
n m
===============
john bob
john alice
john charlie
bob alice
bob charlie
alice charlie
===============
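If you want to post-process the table that -D- prints, a small parser is enough. The sketch below assumes the output has exactly the layout shown in this post (a line of dashes, the relation name, the column names, a line of equals signs, the rows, and a closing line of equals signs) and that values contain no whitespace. The function name `parse_stdout_relations` and the `SAMPLE` text are made up for this example and are not part of Soufflé.

```python
from typing import Dict, List, Tuple

# Output copied from the run above; in practice this would be captured stdout.
SAMPLE = """\
---------------
ancestor
n m
===============
john bob
john alice
john charlie
bob alice
bob charlie
alice charlie
===============
"""


def parse_stdout_relations(text: str) -> Dict[str, List[Tuple[str, ...]]]:
    """Parse tables in the layout shown above into {relation: [row, ...]}."""
    relations: Dict[str, List[Tuple[str, ...]]] = {}
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    i = 0
    while i < len(lines):
        # A table starts with a line made only of dashes.
        if set(lines[i]) == {"-"} and i + 3 < len(lines):
            name = lines[i + 1]
            # lines[i + 2] holds the column names, lines[i + 3] the opening '=' rule.
            i += 4
            rows: List[Tuple[str, ...]] = []
            while i < len(lines) and set(lines[i]) != {"="}:
                # Assumption: values contain no spaces or tabs.
                rows.append(tuple(lines[i].split()))
                i += 1
            relations[name] = rows
        i += 1  # skip the closing '=' rule or any unrelated line
    return relations


if __name__ == "__main__":
    for relation, rows in parse_stdout_relations(SAMPLE).items():
        print(relation, rows)
```

Lines that do not belong to a table, such as shell prompts, are skipped, so the captured terminal text can be passed in as is.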
(Optional) Rerun Docker container or delete existing container
To rerun the souffle-container Docker container in interactive mode:
docker start -i souffle-container
List and delete Docker container:
# List all Docker containers, including stopped ones
docker ps -a
# Delete a specific container using its NAME or CONTAINER ID
docker rm souffle-container
By following this tutorial, we can install and use Soufflé on Docker in interactive mode (with either the interpreter or the compiler), allowing us to explore and leverage the power of Datalog for various applications, such as program analysis, model checking, and deductive databases.
References