Databases#

This chapter gives a brief introduction to databases at a level sufficient for use in the course. As many databases use SQL (Structured Query Language, often pronounced “sequel”) or a variation of it, we will also show SQL examples.

Categories#

  • Relational databases/SQL, e.g., MySQL, Microsoft SQL Server, Oracle Database

  • NoSQL, e.g., Cassandra, MongoDB

    • Traditional tables, but no relations between tables

    • Vector databases

    • Graph databases

This list is non-exhaustive both with regard to categories and examples.

Relational databases#

  • Tables with fixed attributes

    • Each column has a name and a storage type.

    • One (or more) column(s) defined as a key, only accepting unique values.

  • Tables are connected (relations) in one of three ways:

    • One-to-many, one-to-one, and many-to-many

https://github.com/khliland/IND320/blob/main/D2Dbook/images/Databases_SQL_relation.png?raw=TRUE

We will be using MySQL as an example of this category.
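The points above (named and typed columns, a unique key, and a one-to-many relation) can be sketched in a few lines of SQL. This is a minimal sketch that assumes SQLite, which ships with Python, as a stand-in for MySQL so that it runs without a server; the `customer` and `purchase` tables and their contents are hypothetical, and MySQL syntax may differ in details.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for a relational server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Each column has a name and a storage type; the primary key accepts only unique values.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")

# One-to-many relation: one customer can have many purchases.
conn.execute("""
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        item        TEXT NOT NULL
    )
""")

conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?, ?)",
    [(10, 1, "keyboard"), (11, 1, "monitor"), (12, 2, "laptop")],
)

# Inserting a second row with an existing key value is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Follow the relation between the two tables with a JOIN.
rows = conn.execute("""
    SELECT c.name, p.item
    FROM customer AS c
    JOIN purchase AS p ON p.customer_id = c.customer_id
    ORDER BY p.purchase_id
""").fetchall()

print(duplicate_rejected)
print(rows)
```

The join returns one row per purchase, so a customer with two purchases appears twice in the result.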

NoSQL#

  • Cacheable, parallelizable, scalable to exabytes of data.

    • E.g., XML databases for querying XML documents based on attributes.

    • May not use tables at all, only key-value pairs and corresponding unstructured data/objects.

  • Some use a subset of SQL commands or an SQL-like query language (NoSQL is usually read as “Not only SQL” or “non-relational”).

    • Accepts more restrictive querying in exchange for speed and scale.

We will be using Cassandra as an example of this category.
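To illustrate the idea of key-value pairs with unstructured objects, here is a minimal sketch of an in-memory key-value store. It is not how Cassandra or any particular NoSQL system is implemented; the `put` and `get` helpers, the key names, and the documents are hypothetical and only show that values under different keys need not share a fixed set of attributes.

```python
import json
from typing import Optional

# Hypothetical in-memory store: each key maps to a serialized, schema-free document.
store = {}

def put(key: str, document: dict) -> None:
    """Store a document under a key, replacing any previous value."""
    store[key] = json.dumps(document)

def get(key: str) -> Optional[dict]:
    """Return the document stored under key, or None if the key is unknown."""
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None

# The two documents have different attributes; no table definition is needed.
put("sensor:1", {"type": "temperature", "value": 21.5})
put("sensor:2", {"type": "camera", "resolution": [1920, 1080], "tags": ["outdoor"]})

print(get("sensor:1"))
print(get("sensor:2")["tags"])
print(get("sensor:3"))  # unknown key
```

Lookups go directly through the key, which is part of what makes such stores easy to cache and distribute; searching on anything other than the key requires extra indexing or a full scan.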

Vector databases#

  • Like NoSQL databases, these are made to overcome limitations of relational databases.

  • Data stored as high-dimensional vectors representing features or attributes.

    • E.g., transformations or embeddings.

  • Applied in natural language processing (NLP), computer vision (CV), recommendation systems (RS), etc. when searching for similar objects.

    • E.g., find the image most similar to my search image.
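The similarity search described above can be sketched as a brute-force comparison of a query vector against stored vectors. This is a minimal sketch assuming cosine similarity as the measure; the three-dimensional embeddings and file names are made up for illustration, and real vector databases typically work with far higher dimensions and use index structures rather than comparing against every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical image embeddings (real embeddings have hundreds of dimensions or more).
embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}

def most_similar(query, vectors):
    """Return the name of the stored vector most similar to the query."""
    return max(vectors, key=lambda name: cosine_similarity(query, vectors[name]))

# Embedding of the search image (also hypothetical).
query = [0.85, 0.15, 0.05]
best = most_similar(query, embeddings)
print(best)
```

With these values the search returns `cat.jpg`, since its vector points in nearly the same direction as the query.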