Beyond SQL

Learning outcomes:

  • Describe the difference between SQL and NoSQL
  • List some database options besides SQL that are commonly used

Would you like to download my PowerPoint to follow along?

  • Pros and Cons of SQL
    • Pros
      • Very popular for a long time so there is a decent amount of documentation and some standards
      • Relatively simple for people who aren't programmers to learn
      • There are some built in security and data integrity features
      • It's possible to see data in different views so you can have more abstraction and can limit user access/viewing when needed
    • Cons
      • Limits on what it can be used for since it requires structured data
      • There can be hidden rules which make it harder to control fully
      • No real time analytics, so things that want real time data processing is an issue
      • Large datasets can lead to longer processing times because of the way queries perform
      • Setup of SQL databases can be complicated and need specific skill sets
  • Beyond SQL
    • NoSQL is non-relational database model, literally non-SQL or not only SQL
    • NoSQL is useful for larger datasets and real time analytics and data processing especially
    • There are a lot of different ways you can organize NoSQL type databases, one is by data model
    • For reference, SQL stores data in tables, NoSQL can store data in many other models including objects, documents and graphs
    • Because the data is not done in a way that shows relationships, schema are designed differently
    • NoSQL tends to not use joins in the queries because the data isn't stored in tables, NoSQL uses multiple queries instead
    • NoSQL Databases NOTE: Very long paper, but has some really good examples and explanations
  • Pros and Cons of NoSQL
    • Pros
      • Scalability for larger datasets
      • Can store data outside the traditional SQL options
      • Real time analytics is possible
      • Less maintenance needed in general
    • Cons
      • Hasn't been popular as long so there has been less stress testing then SQL
      • Less complex queries are done so usually multiple queries are used instead of table joins
      • Additional infrastructure may be needed for scaling
      • Difficult to change data once added
  • SQL vs NoSQL
    • SQL
      • Relational
      • Schema is predesigned
      • Vertically scalable - more power added to scale database, such as upgrading current servers
      • Based on tables
      • Needs structured data
    • NoSQL
      • Non-relational
      • Schema can be dynamic to accommodate unstructured data
      • Horizontally scalable - more nodes are added to scale database, such as adding more servers
      • Uses other data structures besides tables
      • Can work with unstructured and semi-structured data
  • Types of NoSQL
    • There are multiple ways to classify NoSQL databases, some of which can overlap
    • One common way to classify the types is by data structure, or the way in which the data is stored
    • Some common types are graph, key-value and document
    • The vocab that is used for the way the data is stored can be different than expected, especially if you're coming from programming
  • NoSQL Example: Key Value Store
    • This uses a data structure that is an associative array or dictionary if you're coming from programming
    • This type will use key-value pairs, each piece of data has an associated key
    • The unique key is how you can retrieve your data from the database, the data can be simple (string) or complex (object)
    • One way to think of this is like a relational database but it only has two columns in the table, the key and the value
  • NoSQL Example: Document Store
    • Instead of using tables with rows and columns, there is a document to store the data
    • Documents may be closer to the data objects so there may be less modifications needed to retrieve the data
    • Collections are a group of documents with similar contents, you can also use things like tags or metadata to organize and group the data
    • Flexible schema is used, the documents don't need to all have the same schema
    • Document creation is easy, not much maintenance is required
    • No foreign keys are needed because there is no relationship between documents required
  • NoSQL Example: Graph Databases
    • Best for data that has relationships that need a graph to represent them, such as networking topologies or social connections
    • The focus Is on the relationship been the elements. The nodes on a graph can be connected by links
    • Query can have real time results for output
    • How fast results show depend on the number of connections or relationships
    • Adding data is easy by adding new nodes or edges, schema doesn't need to be changed much

Suggested Activities and Discussion Topics:

Would you like to see some more classes? Click here