Database Documentation

Learning outcomes:

  • Describe the importance of database documentation
  • Describe the purpose and benefits of maintaining database documentation
  • Demonstrate how to create and update a data dictionary

Would you like to download my PowerPoint to follow along?

  • What is documentation
    • Documentation is used as a way to lay out and explain what's going on, so it can be referenced later
    • This should include descriptions and notes for whatever you're trying to keep a record of
    • This is an instruction set that is easy to share with others
    • Documentation is like a love letter to your future self. It's a way to say, "Future self, I love you so much that I'm writing down everything I'm doing so you don't have to remember it later." If you don't believe me, try to recall in detail what you were doing six months or a year ago.
    • Database Documentation
  • Why documentation is important
    • Having everything written down makes it easier to share with others and reference later
    • Making sure you have the notes of what you did is important for when you have to redo or fix things
    • Documentation is useful in a lot of places, not just for databases. It's a way of making sure everyone agrees on what's going on
    • Having documentation also means decisions can be made and recorded so you can always look back on what you need to do if you switch projects
    • Documentation can be useful for the creators, but also the end-user
  • Documentation timing
    • There are a number of times you might create and update your documentation
    • Design documentation before the database is created, and write up everything you know about how you want your database to work
    • Update documentation as the database is used, so you can fix things when "In my ideal world this would happen" crashes into "The real world of data wants to talk"
    • Documentation should also be reviewed on a regular basis to make sure it's staying up to date
  • Automated vs manual documentation
    • Some documentation can be automated by the database and you can run commands to update it
      • A schema can be generated by the database once the database exists, so you can see how close your original planned schema is to the real world
      • Automating where you can is important so that your documentation is maintained well
    • Some documentation needs to be done by a human
    • Sometimes we use a combination
      • For example, Data Definition Language (DDL) scripts are useful for creating and modifying things, but the scripts themselves need to be documented so that everyone knows what they should be doing
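
As an illustration of documentation the database can produce for itself, here is a minimal sketch using Python's built-in sqlite3 module. SQLite, the `patient` table, and the `describe_schema` helper are assumptions made for this example; other database systems expose the same kind of information through their own catalog views.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [column info, ...]} read from the live database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [
            {
                "name": col[1],
                "type": col[2],
                "required": bool(col[3]),
                "primary_key": bool(col[5]),
            }
            for col in columns
        ]
    return schema


# Hypothetical example table, created in memory so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient ("
    " patient_id INTEGER PRIMARY KEY,"
    " full_name TEXT NOT NULL,"
    " date_of_birth TEXT)"
)

for table, columns in describe_schema(conn).items():
    print(table)
    for col in columns:
        print(f"  {col['name']:<15} {col['type']:<8} required={col['required']}")
```

Rerunning a script like this after every schema change gives you a current listing to compare against the design you wrote before the database was created.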
  • How to combine manual and automated documentation tools
    • Automate away what you can. If there is a reasonable option to have something automatically update the documentation, use it
    • Have a human check that everything makes sense and that the documentation matches the database
    • This can be a reasonable use case for AI, but only if the tool meets your privacy and security standards (most don't, especially for sensitive data)
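
One possible way to combine the two approaches is to let a script compare the human-written descriptions against the live schema and hand the differences to a person for review. The sketch below assumes SQLite; the `customer` table, the `DESCRIPTIONS` dictionary, and the helper names are all hypothetical.

```python
import sqlite3

# Written by a human: what each column means (hypothetical example content).
DESCRIPTIONS = {
    ("customer", "customer_id"): "Surrogate key, assigned by the database.",
    ("customer", "email"): "Contact address for the customer.",
    ("customer", "fax"): "Fax number.",  # stale: this column no longer exists
}


def live_columns(conn):
    """Return the set of (table, column) pairs that exist in the database."""
    pairs = set()
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        for col in conn.execute(f'PRAGMA table_info("{table}")'):
            pairs.add((table, col[1]))  # col[1] is the column name
    return pairs


def review_list(conn, descriptions):
    """Compare the manual notes with the real schema and report mismatches."""
    live = live_columns(conn)
    documented = set(descriptions)
    return {
        "undocumented": sorted(live - documented),  # in the database, no notes
        "stale": sorted(documented - live),         # notes for missing columns
    }


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY, email TEXT, signup_date TEXT)"
)
report = review_list(conn, DESCRIPTIONS)
print(report)
```

The script only finds what is missing or out of date. Deciding what a column actually means, and writing that down, stays with the human reviewer.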
  • Data dictionaries and business logic
    • Data dictionaries are places where you keep descriptions of your data, tables, and functions
    • Having a well-documented data dictionary ensures you have consistency across the project, including any conventions required
    • Business logic is the information you need to decide what data you need, how it's stored, and what data types, if any, are expected
    • Business logic is the real-world constraints or rules you have to follow, such as workflows and access
    • Business logic might include things like what data is considered sensitive and should have extra protections
    • Business logic might also have rules for how the database can be communicated with by end-users, or end-user applications
    • Should You Keep Your Business Logic In Your Database?
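
As one illustration of business logic kept inside the database, a rule can be written as a CHECK constraint so the database itself rejects rows that break it. This is a sketch under assumptions: SQLite is used for convenience, and the `orders` table, its rules, and the `try_insert` helper are hypothetical. Whether rules belong in the database or in the application is a design decision for each project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id  INTEGER PRIMARY KEY,
        quantity  INTEGER NOT NULL CHECK (quantity > 0),  -- rule: no empty orders
        status    TEXT NOT NULL
                  CHECK (status IN ('new', 'shipped', 'cancelled'))  -- rule: allowed workflow states
    )
    """
)


def try_insert(quantity, status):
    """Return True if the row satisfies the rules, False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (quantity, status) VALUES (?, ?)",
                (quantity, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(3, "new"))   # True
print(try_insert(0, "new"))   # False: quantity must be positive
print(try_insert(1, "lost"))  # False: 'lost' is not an allowed status
```

The comments next to each constraint double as documentation, but the same rules should still be described in the data dictionary so people who never read the DDL can find them.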
  • Version control
    • When you have something important like documentation or code you should keep it under version control
    • Version control is a way to track and manage updates and changes
    • Version control should keep a record of all changes, who made them, and when they were made, along with the ability to roll back to a previous version
    • Version control can be done in house or outsourced to something on the cloud
    • GitHub, for example, is a widely used hosting service for Git repositories that offers free accounts
    • Database Version Control
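
To make the "what changed, who changed it, and when" record concrete for a database, here is a minimal sketch of a change log kept in the database itself. It assumes SQLite, and the `MIGRATIONS` list, the `schema_changes` table, and the `migrate` function are hypothetical names for this example. It only applies and records changes; rolling back, and versioning the scripts themselves, would still be handled by a version control tool such as Git.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical ordered list of schema changes: (version, description, DDL).
MIGRATIONS = [
    (1, "create book table",
     "CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"),
    (2, "add publication year",
     "ALTER TABLE book ADD COLUMN published_year INTEGER"),
]


def migrate(conn, author):
    """Apply changes not yet recorded, logging who applied each one and when."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_changes ("
        " version INTEGER PRIMARY KEY, description TEXT,"
        " author TEXT, applied_at TEXT)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_changes")}
    newly_applied = []
    for version, description, ddl in MIGRATIONS:
        if version in applied:
            continue  # already recorded, so skip it
        conn.execute(ddl)
        conn.execute(
            "INSERT INTO schema_changes VALUES (?, ?, ?, ?)",
            (version, description, author, datetime.now(timezone.utc).isoformat()),
        )
        newly_applied.append(version)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
print(migrate(conn, "alice"))  # [1, 2]  (first run applies both changes)
print(migrate(conn, "bob"))    # []      (second run has nothing left to apply)
```

Querying `schema_changes` afterwards shows which version the database is at, which is useful to record alongside the documentation for that version.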
  • Documentation best practices
    • Documentation for any Data Definition Language (DDL) scripts or other procedures and functions
    • Include the ER diagrams and schema
    • Descriptions of the data, such as the tables, indexes, and constraints or rules. This should include clearly labeled keys
    • Any business logic that is required should be included
    • Version control
    • Regular updates
    • Backup procedures and implementation guidelines
  • How to share your documentation with others

Suggested Activities and Discussion Topics:
