Asking Questions of CSV Data, Using SQL In the Browser, With Franchise

Notebook style interfaces, in which content blocks are organised in separate cells that can be moved up or down a document, are starting to look as if their time may have come. Over the last week, I’ve come across two examples.

The first, an early preview of the OU’s under development OpenCreate authoring environment uses an outliner style editor to support the creation of a weekly study planner and topics within each week, and a notebook style interface for editing the topical content pages. I would show screenshots but I’ve pre-emptively been told to not to post videos or screenshots…

The second is an open project – a live demo and the code repository are available – and it comes in the form of Franchise, a simple app service that delivers a rich, browser based SQL query engine for querying simple data files (read the strapline and the name makes punful sense!).

Launching the service provides you with an interface that lets you connect to a database, or load in a data file, either by selecting it from a file browser or just dragging it onto the launch page.

Uploading a CSV document creates a SQLite3 database containing the data in a single table

Selecting the data table generates a SQL query that reveals the column names. Running the query generates a preview of the data table and also makes the output queryable as a “tagged” table.

The resulting table can then be queried using the tag name:

You can also use the download button to download the results table in a variety of formats:

If the resulting table has numerical columns, you can display the table using a range of charts, such as a bar chart.

For now, it looks as if the charts are quite simplistic – for example, we can’t stack or group the bars:

Other charts are available depending in a context sensitive way. For example, if there are two numerical columns we can plot a scatter char. Line charts are also available.

If the dataset contains latitude and longitude date, we we can use the data to plot points on a map.

For those new to SQL, there’s a handy cribsheet a the bottom of the page:

Franchise20

(If SQL is new to you, you might also find things like this handy: Asking Questions of Data – Garment Factories Data Expedition.)

We can also add textual commentary to the notebook in the form of markdown cells.

The markdown is styled responsively – but I couldn’t see how to go to to “preview” mode where the styling is applied but the markdown modifiers are hidden?

Cells are archived rather than deleted:

Although they can be deleted, as well as restored, from the archive.

Cells can also be reordered – click on the right hand siadebar of a cell to drag it into a slot above or below another cell, or alongside one.

Cells can also be duplicated, in which case they appear alongside the cloned cell.

The side by side view allows you to look at the effect of a changing a query compared to its original form.

I was quite excited by the idea that you could download the notebook:

and export it as an HTML file:

I had expected this to generated a standalone HTML file, but that appears not to be the case, at least for now. Instead, the cell data is packed into a JSON object:

and then passed to either a local Franchise server, or the web based one.

As a quick tool for querying data, Franchise looks to be pretty handy, although you soon realise how lacking in control it is over chart styles and axis labelling, for example (at least in int’s early form). If you could export standalone HTML, it would also make it more useful as an asset generating tool, but I guess it’s still early days.

According to a release thread – Franchise – An Open-Source SQL Notebook (franchise.cloud) – it looks as  if a standalone electron app version is on the way. (In the meantime, I followed the build instructions from the repo README to produce a quick docker container: psychemedia/franchise.)

The ability to get started querying data using SQL without the need to install anything offers a way of having a quick chat with a file based dataset. (I couldn’t get it to work with Excel or JSON files, and didn’t try a SQL file or connecting to a separate database server.)

At the moment, I don’t think you can connect to a Google spreadsheet, so you have to download one , although a SQL like API is available for Google Sheets (eg I used it for this simple SQL query interface to Google spreadhseets way back when).

From a getting started with data conversations perspective, though, this offers quite a nice on ramp to a SQL query environment without the need to worry about the DBA (database admin) chores of setting up a database, defining tables, importing the data and so on.

I also wonder if it might act as a gateway to more aggressive and powerful query engines that are capable of querying over large and multiple datasets contained in local files? Things like Apache Drill, for example?

See also:

 

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...