Beyond SQL
Learning outcomes:
- Describe the difference between SQL and NoSQL
- List some database options besides SQL that are commonly used
Would you like to download my PowerPoint to follow along?
- Pros and Cons of SQL
- Pros
- Very popular for a long time so there is a decent amount of documentation and some standards
- Relatively simple for people who aren't programmers to learn
- There are some built in security and data integrity features
- It's possible to see data in different views so you can have more abstraction and can limit user access/viewing when needed
- Cons
- Limits on what it can be used for since it requires structured data
-
- There can be hidden rules which make it harder to control fully
- No real time analytics, so things that want real time data processing is an issue
- Large datasets can lead to longer processing times because of the way queries perform
- Setup of SQL databases can be complicated and need specific skill sets
- Beyond SQL
- NoSQL is non-relational database model, literally non-SQL or not only SQL
- NoSQL is useful for larger datasets and real time analytics and data processing especially
- There are a lot of different ways you can organize NoSQL type databases, one is by data model
- For reference, SQL stores data in tables, NoSQL can store data in many other models including objects, documents and graphs
- Because the data is not done in a way that shows relationships, schema are designed differently
- NoSQL tends to not use joins in the queries because the data isn't stored in tables, NoSQL uses multiple queries instead
- NoSQL Databases NOTE: Very long paper, but has some really good examples and explanations
- Pros and Cons of NoSQL
- Pros
- Scalability for larger datasets
- Can store data outside the traditional SQL options
- Real time analytics is possible
- Less maintenance needed in general
- Cons
- Hasn't been popular as long so there has been less stress testing then SQL
- Less complex queries are done so usually multiple queries are used instead of table joins
- Additional infrastructure may be needed for scaling
- Difficult to change data once added
- SQL vs NoSQL
- SQL
- Relational
- Schema is predesigned
- Vertically scalable - more power added to scale database, such as upgrading current servers
- Based on tables
- Needs structured data
- NoSQL
- Non-relational
- Schema can be dynamic to accommodate unstructured data
- Horizontally scalable - more nodes are added to scale database, such as adding more servers
- Uses other data structures besides tables
- Can work with unstructured and semi-structured data
- Types of NoSQL
- There are multiple ways to classify NoSQL databases, some of which can overlap
- One common way to classify the types is by data structure, or the way in which the data is stored
- Some common types are graph, key-value and document
- The vocab that is used for the way the data is stored can be different than expected, especially if you're coming from programming
- NoSQL Example: Key Value Store
- This uses a data structure that is an associative array or dictionary if you're coming from programming
- This type will use key-value pairs, each piece of data has an associated key
- The unique key is how you can retrieve your data from the database, the data can be simple (string) or complex (object)
- One way to think of this is like a relational database but it only has two columns in the table, the key and the value
- NoSQL Example: Document Store
- Instead of using tables with rows and columns, there is a document to store the data
- Documents may be closer to the data objects so there may be less modifications needed to retrieve the data
- Collections are a group of documents with similar contents, you can also use things like tags or metadata to organize and group the data
- Flexible schema is used, the documents don't need to all have the same schema
- Document creation is easy, not much maintenance is required
- No foreign keys are needed because there is no relationship between documents required
- NoSQL Example: Graph Databases
- Best for data that has relationships that need a graph to represent them, such as networking topologies or social connections
- The focus Is on the relationship been the elements. The nodes on a graph can be connected by links
- Query can have real time results for output
- How fast results show depend on the number of connections or relationships
- Adding data is easy by adding new nodes or edges, schema doesn't need to be changed much
Suggested Activities and Discussion Topics:
Would you like to see some more classes?
Click here