MongoDB#

To set up a MongoDB cluster, follow these steps:

https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.001.png?raw=TRUE

Create a free user#

https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.002.png?raw=TRUE

Select a username and password to be used through Python#

(Quickstart menu on the left) https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.003.png?raw=TRUE

Open access for connection#

Add the IP address 0.0.0.0/0 to avoid connection issues (this allows connections from any IP address), then click “Finish and close” at the bottom. https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.004.png?raw=TRUE

Choose connection type#

From the Overview menu, select Python as Application Development and click “Get connection string” https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.005.png?raw=TRUE

Follow the instructions, making sure to install the driver in the correct Python environment.
Click “View full code example” to see the test code. https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.006.png?raw=TRUE

MongoDB’s suggested code for checking if database is reachable#

NMBU’s VPN service seems to block a port used by MongoDB, so the VPN may need to be disabled before running this code.

from pymongo.mongo_client import MongoClient
from pymongo.server_api import ServerApi

# The file holds the username on the first line and the password on the second.
with open('../../../No_sync/MongoDB') as f:
    USR, PWD = f.read().splitlines()

uri = "mongodb+srv://"+USR+":"+PWD+"@ind320.rhc2o.mongodb.net/?retryWrites=true&w=majority&appName=IND320"

# Create a new client and connect to the server
client = MongoClient(uri, server_api=ServerApi('1'))

# Send a ping to confirm a successful connection
try:
    client.admin.command('ping')
    print("Pinged your deployment. You successfully connected to MongoDB!")
except Exception as e:
    print(e)
Pinged your deployment. You successfully connected to MongoDB!
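
If the username or password contains characters such as `@`, `:` or `/`, they have to be percent-encoded before being placed in the connection string. The following is a minimal sketch of that step. The helper name `build_uri` and the example password are made up for illustration, and the host and app name are the ones used in this chapter.

```python
from urllib.parse import quote_plus


def build_uri(user, password, host="ind320.rhc2o.mongodb.net", app_name="IND320"):
    """Return a mongodb+srv URI with percent-encoded credentials.

    `build_uri` is a hypothetical helper; replace host and app name
    with the values from your own cluster.
    """
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{host}/"
        f"?retryWrites=true&w=majority&appName={app_name}"
    )


# The password below is invented; '@' and '/' are encoded as %40 and %2F.
print(build_uri("khliland", "p@ss/word"))
```

The resulting string can be passed to `MongoClient` in the same way as the `uri` above.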

Connecting#

from pymongo.mongo_client import MongoClient

# Find the URI for your MongoDB cluster in the MongoDB dashboard:
# `Connect` -> `Drivers` -> Under heading 3.
uri = ("mongodb+srv://{}:{}@ind320.rhc2o.mongodb.net/"
       "?retryWrites=true&w=majority&appName=IND320")

# Connecting to MongoDB with the chosen username and password.
with open('../../../No_sync/MongoDB') as f:
    USR, PWD = f.read().splitlines()
client = MongoClient(uri.format(USR, PWD))

# Selecting a database and a collection.
database = client['example']
collection = database['data']
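
MongoDB creates databases and collections lazily, so selecting `client['example']['data']` succeeds even before anything has been stored there. The sketch below checks whether a collection actually exists on the server. It assumes a connected `client` as above, and `collection_exists` is a made-up helper name.

```python
def collection_exists(client, database_name, collection_name):
    """Return True if the collection already exists on the server.

    `collection_exists` is a hypothetical helper for illustration;
    `client` is expected to be a connected pymongo MongoClient.
    """
    return collection_name in client[database_name].list_collection_names()


# Usage with the client from above (requires a live connection):
# print(collection_exists(client, 'example', 'data'))
```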

Inserting#

  • The MongoDB structure is such that each database contains collections.

  • These collections contain documents, which are similar to dictionaries.

  • Thus, when inserting data, we use dictionaries.

# Inserting a single document (dictionary).
collection.insert_one({'name': 'Hallvard', 'age': 23})

# Inserting multiple documents (list of dictionaries).
collection.insert_many([
    {'name': 'Kristian', 'age': 27},
    {'name': 'Ihn Duck', 'age': 15},
#    {'name': 'Ihn Duck', 'age': 16},
])

# Note that an _id field is automatically generated by MongoDB.
InsertManyResult([ObjectId('68dd76f7c706efef5a2a46e6'), ObjectId('68dd76f7c706efef5a2a46e7')], acknowledged=True)
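
Tabular data has to be turned into dictionaries before it can be inserted. Below is a small sketch of that conversion, assuming rows of `(name, age)` tuples. The helper names `rows_to_documents` and `insert_rows` are invented for this example.

```python
def rows_to_documents(rows, fields=('name', 'age')):
    """Turn tuples such as ('Kristian', 27) into dictionaries."""
    return [dict(zip(fields, row)) for row in rows]


def insert_rows(collection, rows):
    """Insert the rows and return the _id values reported by PyMongo."""
    result = collection.insert_many(rows_to_documents(rows))
    return result.inserted_ids


print(rows_to_documents([('Kristian', 27), ('Ihn Duck', 15)]))
# Usage with the collection from above (requires a live connection):
# ids = insert_rows(collection, [('Kristian', 27), ('Ihn Duck', 15)])
```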

Check existence of record https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.008.png?raw=TRUE

https://github.com/khliland/IND320/blob/main/D2Dbook/images/mongodb/setup.009.png?raw=TRUE

Reading#

# Reading ALL documents from a collection.
# ........................................

documents = collection.find({})
# A cursor is returned.

# The cursor can be iterated over:
for document in documents:
    print(document)

# Or directly converted to a list:
#documents = list(documents)
{'_id': ObjectId('68dd68366a9739818703389b'), 'name': 'Hallvard Lavik', 'age': 24}
{'_id': ObjectId('68dd68366a9739818703389c'), 'name': 'Kristian', 'age': 28}
{'_id': ObjectId('68dd68366a9739818703389d'), 'name': 'Ihn Duck', 'age': 16}
{'_id': ObjectId('68dd76f6c706efef5a2a46e5'), 'name': 'Hallvard', 'age': 23}
{'_id': ObjectId('68dd76f7c706efef5a2a46e6'), 'name': 'Kristian', 'age': 27}
{'_id': ObjectId('68dd76f7c706efef5a2a46e7'), 'name': 'Ihn Duck', 'age': 15}
# Reading SPECIFIC documents from a collection.
# .............................................

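# Convert the cursor to a list first: a cursor can only be iterated once,
# so calling list() after the loop would return an empty list.
hallvard = list(collection.find({'name': 'Hallvard'}))

for document in hallvard:
    print(document)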
{'_id': ObjectId('68dd76f6c706efef5a2a46e5'), 'name': 'Hallvard', 'age': 23}
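
`find` also accepts comparison operators, a projection that selects which fields to return, and sorting and limiting options. The sketch below combines them, assuming documents with `name` and `age` fields as above. The function name `oldest_names` is hypothetical.

```python
def oldest_names(collection, min_age, n=2):
    """Return the names of the n oldest people older than min_age."""
    cursor = collection.find(
        {'age': {'$gt': min_age}},  # Filter: age > min_age.
        {'_id': 0, 'name': 1},      # Projection: only return `name`.
        sort=[('age', -1)],         # Sort by age, descending.
        limit=n,                    # Return at most n documents.
    )
    return [document['name'] for document in cursor]


# Usage with the collection from above (requires a live connection):
# print(oldest_names(collection, min_age=20))
```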

Updating#

  • Updating documents is done using the update_one and update_many methods.

  • The first argument is a query that selects the documents to update.

  • The second argument is a dictionary that specifies the changes.

# Updating a single document.
# ...........................
collection.update_one(
    {'name': 'Hallvard'},
    {'$set': {'name': 'Hallvard Lavik'}}  # Sets the `name` to `Hallvard Lavik`.
)

# Updating multiple documents.
# ............................
collection.update_many(
    {},
    {'$inc': {'age': 1}}  # Increments the `age` of all documents by `1`.
)
UpdateResult({'n': 6, 'electionId': ObjectId('7fffffff00000000000000f0'), 'opTime': {'ts': Timestamp(1759344375, 9), 't': 240}, 'nModified': 6, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1759344375, 9), 'signature': {'hash': b'\xfa\xeb\xe5K\xc6\xe84y\x00\xa8+^=Y\x8eD\x9d\x1dM\xb5', 'keyId': 7512488947117719554}}, 'operationTime': Timestamp(1759344375, 9), 'updatedExisting': True}, acknowledged=True)
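
The returned `UpdateResult` exposes counts that are easier to read than the raw output above, and passing `upsert=True` inserts a document when the query matches nothing. The following is a sketch of both ideas; `set_age` is a made-up helper name.

```python
def set_age(collection, name, age):
    """Set the age for `name`, inserting the document if it is missing."""
    result = collection.update_one(
        {'name': name},
        {'$set': {'age': age}},
        upsert=True,  # Insert {'name': name, 'age': age} if nothing matches.
    )
    return {
        'matched': result.matched_count,
        'modified': result.modified_count,
        'upserted_id': result.upserted_id,  # None unless a document was inserted.
    }


# Usage with the collection from above (requires a live connection):
# print(set_age(collection, 'Kristian', 30))
```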

Aggregating#

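  • An aggregation pipeline combines multiple operations (stages) into a single query. In the example, $match filters the documents and $group averages the age of those that remain.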

pipeline = [
    {'$match': {'age': {'$gt': 20}}},
    {'$group': {'_id': None, 'average_age_over_20': {'$avg': '$age'}}},
]
result = collection.aggregate(pipeline)
result = list(result)
print(result)
[{'_id': None, 'average_age_over_20': 26.5}]
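
To see what the two stages do, the same computation can be written in plain Python over a list of dictionaries. This only illustrates the semantics and is not how MongoDB executes the pipeline. The ages are taken from the listing printed in the Reading section, incremented by one as in the update step.

```python
documents = [
    {'name': 'Hallvard Lavik', 'age': 25},
    {'name': 'Kristian', 'age': 29},
    {'name': 'Ihn Duck', 'age': 17},
    {'name': 'Hallvard Lavik', 'age': 24},
    {'name': 'Kristian', 'age': 28},
    {'name': 'Ihn Duck', 'age': 16},
]

# $match: keep only documents with age > 20.
matched = [d for d in documents if d['age'] > 20]

# $group with _id None: collapse all matched documents into a single group
# and average their ages.
ages = [d['age'] for d in matched]
result = [{'_id': None, 'average_age_over_20': sum(ages) / len(ages)}]
print(result)  # [{'_id': None, 'average_age_over_20': 26.5}]
```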

Accessing through Streamlit#

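  • There are several ways to access MongoDB from Streamlit. The one shown here stores the connection string in Streamlit's secrets.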

  • The Python code below assumes a secrets.toml with a [mongo] section.

    • Exchange khliland, ind320, rhc2o, IND320, example and data with your equivalent data.

    • Exchange "+PWD+", including the quotation marks and plus signs, with the actual password so that the URI becomes one quoted string.

    • Copy the contents of secrets.toml to the app's secrets settings on streamlit.io when deploying.

[mongo]
uri = "mongodb+srv://khliland:"+PWD+"@ind320.rhc2o.mongodb.net/?retryWrites=true&w=majority&appName=IND320"
# mongodb.py
import streamlit as st
import pymongo

# Initialize connection.
# Uses st.cache_resource to only run once.
@st.cache_resource
def init_connection():
    return pymongo.MongoClient(st.secrets["mongo"]["uri"])

client = init_connection()

# Pull data from the collection.
# Uses st.cache_data to only rerun when the query changes or after 10 min.
@st.cache_data(ttl=600)
def get_data():
    db = client['example']
    collection = db['data']
    items = collection.find()
    items = list(items)
    return items

items = get_data()

# Print results.
for item in items:
    st.write(f"{item['name']} is {item['age']} years old")
# In a separate notebook cell: launch the app (adjust the path to your own setup).
path_prefix = "/Users/kristian/Documents/GitHub/IND320/streamlit/"
file_name = "mongodb.py"
# !streamlit run {path_prefix}{file_name}
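
For tabular display it can be convenient to drop the `_id` field before handing the documents to Streamlit. Below is a sketch in plain Python. `to_rows` is a hypothetical helper, and passing its result to `st.dataframe` is only one possible use.

```python
def to_rows(items):
    """Return copies of the documents without the `_id` field."""
    return [{k: v for k, v in item.items() if k != '_id'} for item in items]


print(to_rows([{'_id': 1, 'name': 'Kristian', 'age': 28}]))
# In mongodb.py one could then write: st.dataframe(to_rows(items))
```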

Deleting#

  • Deleting documents is done using the delete_one and delete_many methods.

# Deleting a single document.
# ...........................
collection.delete_one({'name': 'Ihn Duck'})  # Deletes the document with `name = Ihn Duck`.
DeleteResult({'n': 1, 'electionId': ObjectId('7fffffff00000000000000f0'), 'opTime': {'ts': Timestamp(1759344375, 10), 't': 240}, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1759344375, 10), 'signature': {'hash': b'\xfa\xeb\xe5K\xc6\xe84y\x00\xa8+^=Y\x8eD\x9d\x1dM\xb5', 'keyId': 7512488947117719554}}, 'operationTime': Timestamp(1759344375, 10)}, acknowledged=True)
# Deleting multiple documents.
# ............................
collection.delete_many({'age': {'$gt': 25}})  # Deletes documents where `age > 25`.

# Deleting all documents.
# .......................
collection.delete_many({})
DeleteResult({'n': 3, 'electionId': ObjectId('7fffffff00000000000000f0'), 'opTime': {'ts': Timestamp(1759344375, 14), 't': 240}, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1759344375, 14), 'signature': {'hash': b'\xfa\xeb\xe5K\xc6\xe84y\x00\xa8+^=Y\x8eD\x9d\x1dM\xb5', 'keyId': 7512488947117719554}}, 'operationTime': Timestamp(1759344375, 14)}, acknowledged=True)
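
Since `delete_many` removes everything that matches the query, it can be worth counting the matches first and reading `deleted_count` afterwards. The following is a sketch of that pattern; `delete_older_than` is a made-up name.

```python
def delete_older_than(collection, age):
    """Delete documents with age > `age`; return (expected, deleted)."""
    query = {'age': {'$gt': age}}
    expected = collection.count_documents(query)  # How many would be removed.
    result = collection.delete_many(query)
    return expected, result.deleted_count


# Usage with the collection from above (requires a live connection):
# print(delete_older_than(collection, 25))
```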