Why use SQLite?
- SQLite's footprint is very small. This is great for development and creating demos, as it will not kill your laptop.
- It's easy to use.
- It has a CLI.
- It's scriptable, which makes starting a project from a CSV file a breeze.
Note: While SQLite is easy to use and convenient, it is not officially supported by Stardog and thus is not recommended for production!
How to install SQLite on your machine
Mac
macOS already comes with SQLite installed. However, if you want a more up-to-date version, run:
brew install sqlite3
Linux
Debian-based systems:
apt-get install sqlite3
RPM-based systems:
yum install sqlite
Installing the JDBC driver to Stardog
Go to the sqlite-jdbc release page, scroll down to "Assets", and download the jar file. The file name will be something like sqlite-jdbc-X.XX.X.X.jar.
Run the following commands:
export STARDOG_EXT=$STARDOG_HOME/lib
mkdir $STARDOG_EXT
cp sqlite-jdbc-X.XX.X.X.jar $STARDOG_EXT # adjust the version to match your download
Restart your Stardog server.
If you are using a docker install, you will need to adjust the start command slightly:
docker run -it -e STARDOG_EXT="/var/opt/stardog/lib" -v $STARDOG_HOME:/var/opt/stardog -p 5820:5820 stardog/stardog
Using SMS2
Now you can create a mapping file that uses SQLite just like you would any other database. This requires a property file. Here is what we recommend:
jdbc.url=jdbc:sqlite:/var/opt/stardog/database.db
jdbc.username=insert_username
jdbc.password=changeit
jdbc.driver=org.sqlite.JDBC
sql.default.schema=main
sql.dialect=POSTGRESQL
sql.skip.validation=true
- For the jdbc.url, you can use a relative path, which would look for the database in STARDOG_HOME. We recommend using an absolute path to be explicit about where the database is located.
- jdbc.username and jdbc.password can be blank, but they must be present in the properties file.
- jdbc.driver must be org.sqlite.JDBC.
- sql.default.schema=main and sql.skip.validation=true are required.
Note: Putting the database file in STARDOG_HOME is simply a recommendation, but it does make life easier.
There is no official sql.dialect for SQLite, so you may adjust it based on your needs. This table tracks dialects and their features:
Dialect | LIMIT | CONCAT | Needs Quotes? |
---|---|---|---|
ORACLE | No | Yes | Yes |
HIVE | Yes | No | No |
MYSQL | No | Yes | No |
ATHENA | Yes | Yes | No |
POSTGRESQL | Yes | Yes | No |
- To use stardog-admin virtual add with SQLite, the dialect you choose must have "Yes" in the LIMIT column.
- To use SQLite's concat feature (i.e., field1 || ' ' || field2), the dialect must have "Yes" in the CONCAT column.
- The "Needs Quotes?" column shows whether quoting is necessary for column or table names.
- Our recommendation is to use POSTGRESQL.
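As a quick sanity check of the dialect features above, you can confirm that SQLite itself supports both LIMIT and the || concatenation operator. Here is a minimal sketch using Python's standard-library sqlite3 module (the table and sample data are made up for illustration):

```python
import sqlite3

# In-memory database; no file is created.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (first_name TEXT, last_name TEXT)")
con.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ada", "Lovelace"), ("Grace", "Hopper"), ("Alan", "Turing")],
)

# SQLite supports || for concatenation (the CONCAT column above)
# and LIMIT (required for stardog-admin virtual add).
rows = con.execute(
    "SELECT first_name || ' ' || last_name AS full_name FROM employee "
    "ORDER BY last_name LIMIT 2"
).fetchall()
print(rows)  # [('Grace Hopper',), ('Ada Lovelace',)]
```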
Example of usage
Manipulating CSV
In general, SMS2 support for CSV is very easy to use. However, sometimes you are given many CSV files, and you need to filter the data you're loading. One solution could be to load all the data in Stardog in a staging area, drop the data that's not wanted, and move it to your desired area.
The problem is the logic of transforming/dropping the data exists in two locations. Enter SQLite. Loading CSV files is very easy - you can do it via the CLI:
sqlite> .mode csv
sqlite> .import ~/file.csv cities
or use SQLite Studio IDE,
or script it:
(echo .separator ,; echo .import file1.csv table1) | sqlite3 database.db
(echo .separator ,; echo .import file2.csv table2) | sqlite3 database.db
(echo .separator ,; echo .import file3.csv table3) | sqlite3 database.db
(echo .separator ,; echo .import file4.csv table4) | sqlite3 database.db
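If you'd rather not shell out to the sqlite3 CLI, the same bulk load can be scripted with Python's standard-library sqlite3 and csv modules. This is a sketch only; the file and table names here are hypothetical, and the demo writes its own small CSV so it is self-contained:

```python
import csv
import sqlite3

def import_csv(db_path, csv_path, table):
    """Create `table` from the CSV header row and bulk-insert the data rows."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(f'"{c}"' for c in header)
        placeholders = ", ".join("?" for _ in header)
        with sqlite3.connect(db_path) as con:
            con.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
            con.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)

# Demo: write a small CSV, then import it.
with open("cities.csv", "w", newline="") as f:
    csv.writer(f).writerows([["name", "country"], ["Lyon", "FR"], ["Kyoto", "JP"]])

import_csv("database.db", "cities.csv", "cities")
count = sqlite3.connect("database.db").execute(
    "SELECT count(*) FROM cities"
).fetchone()[0]
print(count)  # 2
```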
Remember to copy the database file into your STARDOG_HOME directory.
Now you have a self-contained SQL database. In the SMS file you can use SQL to select the data you want. You also have access to some transform functions. Note, however, that the CONCAT function works when mapping CSV files directly in the SMS2 mapping file, but it will not work inside the SQL block; use SQLite's || operator there instead.
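Before wiring a query into the mapping file, it helps to test it against the SQLite database directly. A minimal sketch with Python's standard-library sqlite3 module (the table and data are hypothetical; point the connection at your database.db to test a real file):

```python
import sqlite3

con = sqlite3.connect(":memory:")  # swap in "database.db" for a real file
con.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
con.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("Lyon", 522_000), ("Kyoto", 1_464_000), ("Oslo", 709_000)],
)

# The same SELECT you plan to put in the SMS2 mapping file.
query = "SELECT name FROM cities WHERE population > 600000 ORDER BY name"
names = [row[0] for row in con.execute(query)]
print(names)  # ['Kyoto', 'Oslo']
```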
Once you've tested your query, you can import the data like this:
stardog-admin virtual import example ex.properties ex4.sms
Now you have all your data logic around loading the CSV file encapsulated in one location, your SMS file.
Here is an example:
prefix ex: <https://example.org/>
prefix vcard: <http://www.w3.org/2006/vcard/ns#>
prefix owl: <http://www.w3.org/2002/07/owl#>

MAPPING
FROM SQL {
  SELECT
    first_name || ' ' || last_name AS FullName,
    CASE s.statusCode WHEN 1 THEN 'Full Time' ELSE 'Consultant' END Type,
    *
  FROM employee e
  INNER JOIN status s ON e.id = s.customer_id
  WHERE s.statusCode != 0
}
TO {
  ?Employee a ex:Employee ;
    ex:employeeId ?id ;
    ex:type ?Type ;
    ex:fullName ?FullName ;
    ex:givenName ?first_name ;
    ex:familyName ?last_name ;
    ex:email ?email ;
}
WHERE {
  BIND(template("https://example.org/employee#{id}") AS ?Employee)
}
Looking at the mapping file, we can see all the mappings, transforms, and filters that were performed on the CSV dataset, as if we had performed these operations over a relational database.
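The SELECT in the mapping can be exercised on its own before it ever reaches Stardog. Here is a sketch that builds the hypothetical employee and status tables in an in-memory SQLite database and runs the same join, CASE, and || concatenation (an ORDER BY is added only to make the demo output deterministic; the sample rows are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE employee (id INTEGER, first_name TEXT, last_name TEXT, email TEXT)"
)
con.execute("CREATE TABLE status (customer_id INTEGER, statusCode INTEGER)")
con.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Ada", "Lovelace", "ada@example.org"),
    (2, "Alan", "Turing", "alan@example.org"),
    (3, "Grace", "Hopper", "grace@example.org"),
])
con.executemany("INSERT INTO status VALUES (?, ?)", [(1, 1), (2, 0), (3, 2)])

# The same SELECT used in the MAPPING FROM SQL block.
rows = con.execute("""
    SELECT first_name || ' ' || last_name AS FullName,
           CASE s.statusCode WHEN 1 THEN 'Full Time' ELSE 'Consultant' END Type
    FROM employee e
    INNER JOIN status s ON e.id = s.customer_id
    WHERE s.statusCode != 0
    ORDER BY FullName
""").fetchall()
print(rows)  # [('Ada Lovelace', 'Full Time'), ('Grace Hopper', 'Consultant')]
```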