Database Matchmaking

Learning outcomes:

  • Describe how to evaluate database types for different situations
  • List database types with appropriate data

Would you like to download my PowerPoint to follow along?

  • Why it's important to pick a database to match your data/company
    • A lot of times people tend to go with what they know, but that might not always be the optimal solution
    • Starting with the simple "Do I use relational databases or not?" is a good place to start narrowing down the myriad options (estimated at over 300)
    • Each database has its own pros and cons, and it's important to think about those with your use case and dataset
    • There are likely several good or good enough options, don't let perfect be the enemy of the good
    • How to Choose The Right Database for Your Application
    • Choosing a Database for Your Project: Step by Step
  • Considerations
    • Integrations - What systems does your database need to work with
    • Scaling - How large you think your database will be, and how many people will be accessing it either for viewing or changing
    • Support - How much money will you be spending to make sure the database is being maintained properly
    • Those things will end up being the basis for how you can decide which database is the right one for you
  • Some important questions that need to be asked
    • How much data are you working with?
    • How many people add data? See data? Access the database?
    • Do you have a preferred language for programming?
    • What is the budget?
    • Do you have availability requirements of the database?
    • What scalability do you expect to need?
    • Do you already have anything in play that needs to be taken into consideration? Such as tools and services you need to integrate with?
    • What kind of support/staff do you have? Knowledge base?
  • Relational vs Non-Relational database
    • Relational
      • Structured data required
      • Needs relationships between tables of data to make sense
      • Useful for ACID transactions (atomicity, consistency, isolation and durability) Basically data integrity during processing and important for places where lost data is BIG uh ohs like anywhere that handles big $$
    • Non-Relational database
      • Can take any data
      • More flexible options for how data relates, or doesn't relate, to other stored data
      • Better at scale, so larger volumes of data can be handled like data warehouses and data lakes
      • Good for when you want to make quick changes so you can speed up your development cycles because they don't have schemas
  • Scaling
    • There is horizontal scaling and vertical scaling
      • Vertical scaling is when you can add more power to a server that holds your database
      • Horizontal scaling is when you add more servers to your database
    • Scalability is a HUGE issue in industry (pun intented)
    • The ability to get more data, more users, more more more, is one of the things that can really hold companies back, and the problem of scalability is one that most large companies are continually trying to solve and make more efficient. Moar for less $$ basically
    • The reverse can be an issue too where people are trying so hard to account for potential scale they complicate their world unnecessarily
    • For example, if you are a 3 person local shop with a small customer base and very little online inventory, do you really need a full NoSQL database implementation for everything? Likely no, no you do not
  • Schemas and data models
    • The type of data set you are using should dictate how you need to store and use your data
    • A database with a schema that is relational makes a lot of sense for something like inventory, or customers, or employees, but it might not make sense for something like social media profile scrapings or videos of cats
    • The data you have might need/want a specific database type if you have something besides traditional text based data like images or videos or music
  • Security and compliance
    • One thing you may have to take into consideration is Industry compliance requirements
      • If you are in a highly regulated industry, you likely have extra rules to follow
      • For example, if you are in healthcare, you have to follow HIPAA, therefore any database you use must also follow HIPAA for data security and privacy, including database encryption options
    • Your specific company might have compliance requirements, For example, some places have preferred vendors and if you can stick with that vendor there is less paperwork
    • One large note here, open source. Open Source is awesome and information wants to be free, BUT not all companies are on board with it, and open source software can let in unexpected vulnerabilities that you need to be aware of
    • Another thing that can happen is a company can divest themselves of risk by having a third party be responsible for the data
  • Examples of databases on the market
    • Relational
      • MySQL
      • SQLite
      • Oracle
      • PostgreSQL
      • MariaDB
      • SQLServer
      • Technically cloud companies have relational database on offer as well, but tend to advertise compatibility with other popular options instead
    • Non-Relational
      • MongoDB
      • Redis
      • Cassandra
      • GraphQL
      • Neo4J
      • ElasticSearch
      • All cloud companies have their own propriety options for NoSQL as well
  • Evaluation of what to purchase/use
    • First and foremost, what is your budget. Infrastructure? Maintenance? DBAs?
    • Are you doing this in house or asking a third party to take care of it for you?
    • What skills does your team/company already have and what are they comfortable learning?
    • How quickly do you need this up and running? Both the "in an ideal world" and the real world options needed
    • Are you looking at "good enough for now options"? And if you need to transfer the database how bad will the transfer be?
    • How easy is the upgrade? Do you have the people/skills to keep this up to date?
    • Who holds the purse strings, and do they understand what you need? It's very common for software to be sold to people that aren't using it so the flashy stuff they show isn't what the people using it need
  • Examples of free to use databases
    • Free databases are actually very commonly used, but they will have different license options and may have limits on who can use them and how
      • Remember, Open Source is awesome, but it may not work for your company/situation for a multitude of reasons
      • Sometimes companies will have free community versions, they are not all the same! Some communities are more helpful then others
    • The biggest issue with free databases is the support, or lack thereof. Remember, free database doesn't have a help desk if you need it and have questions. Sometimes it's worth the money to have someone on call to help fix your issues
      • Support services are NOT cheap, and tend to have frequent and surprisingly high fees. A lot of places are going with a pay to play model and you will commonly see monthly charges for the privilege of using them
    • Free database examples: MariaDB, PostgreSQL, MySQL, MongoDB Community version, Redis Community version
  • Examples of databases and their potential use cases
    • PostgreSQL - Financial information, such as ATM transactions
    • Elasticsearch - Log analytics, such as looking through server logs for specific information
    • Redis - Web hosting service, such as finding the static information about web pages quickly
    • Cassandra - Videos, such as where you left off when watching a video, think of any major video platform and how they remember where you stopped watching
    • MongoDB - News, such as the place that news articles might be held, because news articles are long form text
    • Neo4J - Real time recommendations, such as suggestions for which products or services you might be interested in

Suggested Activities and Discussion Topics:

Would you like to see some more classes? Click here