Databases#

This chapter gives a brief introduction to databases at a level sufficient for use in the course. As many databases use SQL (Structured Query Language, often pronounced “sequel”) or a variant of it, we will also show SQL examples.

Categories#

Relational databases/SQL, e.g., MySQL, Microsoft SQL Server, Oracle Database
NoSQL, e.g., Cassandra, MongoDB
- Traditional tables, but no relations between tables
- Vector databases
- Graph databases

This list is non-exhaustive both with regard to categories and examples.

Relational databases#

Tables with fixed attributes
- Each column has a name and a storage type.
- One (or more) column(s) defined as a key, only accepting unique values.
Tables are connected (relations) in one of three ways:
- One-to-many, one-to-one, and many-to-many

https://github.com/khliland/IND320/blob/main/D2Dbook/images/Databases_SQL_relation.png?raw=TRUE

We will be using MySQL as an example of this category.
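
The ideas above (named and typed columns, a key that only accepts unique values, and a one-to-many relation) can be sketched with a small runnable example. This is a minimal sketch, not the course's MySQL setup: it assumes Python's built-in `sqlite3` module as a stand-in for a MySQL server, and the `customers` and `orders` tables with their contents are hypothetical. The SQL statements are kept simple, but details such as type names may differ slightly in MySQL.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys on request

# Each column has a name and a storage type; the key column only accepts unique values.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")

# One-to-many relation: one customer can have many orders.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    )
""")

# Hypothetical example rows.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Linus")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 99.5), (11, 1, 20.0), (12, 2, 5.0)],
)

# Reusing an existing key value is rejected by the database.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Follow the relation with a JOIN: number of orders and total amount per customer.
totals = conn.execute("""
    SELECT c.name, COUNT(o.order_id), SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(duplicate_rejected)
print(totals)
```

A many-to-many relation would typically be modelled with a third table holding pairs of keys from the two tables it connects.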

NoSQL#

Cacheable, parallelizable, scalable to exabytes of data.
- E.g., XML databases for querying XML documents based on attributes.
- May not use tables at all, only key-value pairs and corresponding unstructured data/objects.
Some use a subset of SQL commands (NoSQL is commonly read as “not only SQL” or “non-relational”).
- Trades some of the strictness of relational querying for speed and scale.

We will be using Cassandra as an example of this category.
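
To illustrate the point about key-value pairs with unstructured data, here is a minimal sketch of the idea in plain Python. It does not use Cassandra or any real database driver: the `store` dictionary, the `put` and `get` helpers, and the sensor documents are all hypothetical, chosen only to show that values stored under different keys need not share a fixed set of columns.

```python
import json

# Hypothetical in-memory key-value store: key -> serialized document.
store = {}

def put(key, document):
    """Store a document under a key; no schema is enforced."""
    store[key] = json.dumps(document)

def get(key):
    """Return the document stored under key, or None if the key is unknown."""
    raw = store.get(key)
    return None if raw is None else json.loads(raw)

# Two documents with different fields: there is no fixed set of columns.
put("sensor:1", {"type": "temperature", "unit": "C", "readings": [21.5, 22.0]})
put("sensor:2", {"type": "camera", "resolution": [1920, 1080]})

print(get("sensor:1")["readings"])
print(sorted(get("sensor:2").keys()))
print(get("sensor:3"))
```

Lookups by key are simple and easy to distribute across machines, which is part of what makes this model scale; the cost is that questions spanning many documents are harder to express than a SQL join.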

Vector databases#

Like NoSQL, these are made to overcome limitations in relational databases.
Data stored as high-dimensional vectors representing features or attributes.
- E.g., transformations or embeddings.
Applied in natural language processing (NLP), computer vision (CV), recommendation systems (RS), etc. when searching for similar objects.
- E.g., find the image most similar to my search image.
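
The similarity search described above can be sketched as a nearest-neighbour lookup over stored vectors. This is a minimal sketch under stated assumptions: the three-dimensional embeddings and file names are made up for illustration (real embeddings typically have hundreds or thousands of dimensions), cosine similarity is one common choice of similarity measure, and the exhaustive scan stands in for the approximate indexes that actual vector databases use to stay fast at scale.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical image embeddings (made-up values, not from a real model).
images = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}

def most_similar(query, vectors):
    """Return the name whose vector has the highest cosine similarity to query."""
    return max(vectors, key=lambda name: cosine_similarity(query, vectors[name]))

# Embedding of the search image (also hypothetical).
query = [0.85, 0.15, 0.05]
best = most_similar(query, images)
print(best)
```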