Azure Table Storage: Essential Knowledge
- Azure tables are used to store non-relational structured data at massive scale
|What you get|How much|
|Compute|750 small compute hours per month|
|Web sites|10 web sites|
|Mobile services|10 mobile services|
|Relational database|1 SQL database|
|SQL reporting|100 hours per month|
|Storage|35 GB with 50,000,000 storage transactions|
|Bandwidth|Unlimited inbound and 25 GB outbound|
|CDN|20 GB outbound with 50,000 transactions|
|Service bus|1,500 relay hours and 500,000 messages|
- There are many types of storage options for the MS cloud. We will focus on Azure tables.
- Here is what we'll cover:
- When to use Azure Tables
- When they are appropriate to consider
- Understanding that Azure Tables are a collection of entities
- Access Azure Tables directly or through a cloud application
- Key Features of Azure Tables
- Relationship between accounts, Tables, and entities
- Efficient Inserts and Updates
- Designing for scale
- Query Design and Performance
- Understanding Partition Keys
- How data is partitioned
- Coding considerations
- Azure Table Query Concepts
- Understanding TableServiceEntity/TableServiceContext
- Additional Resources
- These are some typical use case scenarios for using Azure tables.
- Azure tables are optimized for capacity and performance (scale)
- SQL Database is limited currently to 150 GB without federation. Federation can be used to increase the size beyond 150 GB.
- If your code requires strong relational semantics, Azure tables are not appropriate. They don't allow for join statements.
- You can think of Azure tables as nothing more than a collection of objects. Note that each entity (similar to a row in a table) could have different attributes. In the diagram above, the second entity does not have a city property.
- One of the beauties of Azure Tables is that you can replicate your data across data centers, which aids disaster recovery.
- A table is a collection of entities.
- An entity is like an object. It has name/value pairs.
- An entity is kind of like a row in a relational database table, with the caveat that entities don't need to have the exact same attributes.
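To make the "collection of objects" idea concrete, here is a minimal in-memory sketch in Python. It does not call Azure at all; the `customers` data and both helper functions are made up for illustration. Each entity is just a dictionary of name/value pairs, and the second one has no `City` property.

```python
# A table modeled as a plain list of entities. Each entity is a dict of
# name/value pairs. This is an in-memory illustration, not the Azure API.
customers = [
    {"PartitionKey": "customers", "RowKey": "001", "Name": "Ann", "City": "Seattle"},
    {"PartitionKey": "customers", "RowKey": "002", "Name": "Bob"},  # no City property
]


def property_names(table):
    """Return every property name used by at least one entity, sorted."""
    names = set()
    for entity in table:
        names.update(entity)  # updating a set from a dict adds its keys
    return sorted(names)


def entities_missing(table, prop):
    """Return the RowKeys of entities that do not define the given property."""
    return [entity["RowKey"] for entity in table if prop not in entity]
```

Calling `entities_missing(customers, "City")` lists the entities that lack a city, which is the kind of check your own code has to do because the table does not enforce a fixed schema.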
- Any application that can speak HTTP can communicate with Azure tables, because Azure tables are REST-based. This means a Java or PHP application can directly perform CRUD (create, read, update, delete) operations on an Azure Table.
- Azure cloud applications can be hosted in the same data center as the Azure Table Storage. The compelling point here is that latency between the cloud application and the table is very low, so the application can read and update the data at very high speeds.
- One of the key features of Azure tables is the low cost. You can use the Pricing Calculator to determine your predicted costs at http://www.windowsazure.com/en-us/pricing/calculator/
- It is important to remember that Azure tables are non-relational and therefore joins are not possible.
- Azure tables can automatically span multiple storage nodes, maintaining performance. This is based on the partition key that you define. It is very important to consider the partition key carefully as it determines performance.
- Transactions can occur only within a single partition, that is, among entities that share the same Partition Key. This is another example of why you must carefully consider Partition Keys.
- The data is replicated 3 times, including alternate data centers.
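The partition-scoped transaction rule above can be checked before you submit a batch. The sketch below is plain Python with no Azure calls; `validate_batch` is a hypothetical helper of my own. The limit of 100 entities per batch comes from the Table service documentation for entity group transactions rather than from this post, so treat it as an assumption to verify for your service version.

```python
MAX_BATCH_SIZE = 100  # assumed limit for one entity group transaction


def validate_batch(entities):
    """Check that a list of entity dicts could go into one transaction.

    Returns the shared PartitionKey, or raises ValueError if the batch
    is empty, too large, spans partitions, or repeats a RowKey.
    """
    if not entities:
        raise ValueError("batch is empty")
    if len(entities) > MAX_BATCH_SIZE:
        raise ValueError(f"batch has more than {MAX_BATCH_SIZE} entities")

    partition_keys = {entity["PartitionKey"] for entity in entities}
    if len(partition_keys) != 1:
        raise ValueError("all entities in a batch must share one PartitionKey")

    row_keys = [entity["RowKey"] for entity in entities]
    if len(set(row_keys)) != len(row_keys):
        raise ValueError("an entity may appear only once in a batch")

    return partition_keys.pop()
```

If your design regularly needs to update entities together, this check is a quick way to see whether your Partition Key choice allows it.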
- Note that an account can have multiple tables and that each table can have one or more entities.
- Note the URL that is used to access your tables. This is the URL that any client that is http-capable can use.
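As a rough illustration of that URL, the following Python sketch builds the address of a table and of a single entity. The account name, table name, and both function names are placeholders of mine. A real request would also need authentication headers, which are not shown here.

```python
from urllib.parse import quote


def table_url(account, table):
    """Build the base URL of a table in a storage account."""
    return f"https://{account}.table.core.windows.net/{table}"


def entity_url(account, table, partition_key, row_key):
    """Build the URL that addresses one entity by PartitionKey and RowKey.

    Single quotes inside a key are doubled (OData string escaping) and
    the result is percent-encoded so it is safe inside a URL.
    """
    pk = quote(partition_key.replace("'", "''"), safe="")
    rk = quote(row_key.replace("'", "''"), safe="")
    return f"{table_url(account, table)}(PartitionKey='{pk}',RowKey='{rk}')"
```

For example, `entity_url("myaccount", "Customers", "contoso.com", "ann")` produces a URL that any HTTP client could issue a GET against, once it is properly signed.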
- Special semantics are available to make inserts and updates efficient. The bottom line is that you can do either an update or insert in just one operation.
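The Table service exposes this as two "upsert" operations, Insert Or Replace Entity and Insert Or Merge Entity. The Python class below is only an in-memory model of how I understand their semantics; `InMemoryTable` and its methods are hypothetical names, not part of any SDK.

```python
class InMemoryTable:
    """Toy table keyed by (PartitionKey, RowKey), for illustration only."""

    def __init__(self):
        self._entities = {}

    def insert_or_replace(self, entity):
        """Insert the entity, or replace the stored one completely."""
        key = (entity["PartitionKey"], entity["RowKey"])
        self._entities[key] = dict(entity)

    def insert_or_merge(self, entity):
        """Insert the entity, or merge its properties into the stored one."""
        key = (entity["PartitionKey"], entity["RowKey"])
        merged = dict(self._entities.get(key, {}))
        merged.update(entity)  # new values win, untouched properties stay
        self._entities[key] = merged

    def get(self, partition_key, row_key):
        """Return the stored entity, or None if it does not exist."""
        return self._entities.get((partition_key, row_key))
```

The difference matters when an entity already exists: a replace drops properties you did not send, while a merge keeps them. In both cases the caller does not need a separate "does it exist?" round trip first.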
- The Partition Key and RowKey are required properties for each entity. They play a key role in how the data is partitioned and scaled. They also determine performance for various queries. As mentioned previously, they also play a role in transactions (transactions cannot span Partition Keys).
- How to issue efficient queries will be addressed later in this post.
- Performance is always an important consideration. The spectrum of speed varies considerably, depending on the type of query you issue. Specific examples are provided later in this post.
- This slide illustrates how your entities get distributed across partition nodes. Note that the partition key determines how data is spread across storage nodes.
- The key point here is that every entity is uniquely identified by the combination of partition key and row key. You can think of the partition key and row key together as being similar to a primary key in a relational table.
- Azure will automatically manage both the partitioning and the replication of your entities. I am trying to emphasize how important it is to consider the partition key and row key.
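Here is a small Python sketch of those two ideas: entities are grouped by partition key, and the pair of partition key and row key must be unique. It says nothing about how Azure actually assigns partitions to storage nodes. `group_by_partition` is a hypothetical helper that only models the grouping.

```python
from collections import defaultdict


def group_by_partition(entities):
    """Group entity dicts by PartitionKey, then index them by RowKey.

    Raises ValueError if two entities share the same
    (PartitionKey, RowKey) pair, since that pair must be unique.
    """
    partitions = defaultdict(dict)
    for entity in entities:
        pk, rk = entity["PartitionKey"], entity["RowKey"]
        if rk in partitions[pk]:
            raise ValueError(f"duplicate entity key ({pk!r}, {rk!r})")
        partitions[pk][rk] = entity
    return dict(partitions)
```

Note that the same row key may appear in two different partitions without conflict. Only the combination has to be unique.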
- Note that Query 1 is fast because it performs an exact match on partition key and row key. It only returns one entity.
- Query 2 is slower than Query 1 because it does a range-based query.
- Query 3 is slower than Query 2 because it doesn't leverage the row key.
- Queries 4 and 5 are very slow because they don't use the partition key. This is equivalent to a full table scan with SQL Server. You want to avoid this at all costs. You may need to reconsider your partition keys and row keys if you find yourself issuing these types of queries.
- You may even want to keep duplicate copies of your data in other tables that are optimized for certain types of queries.
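To get a feel for why the query shapes above differ so much, here is a toy in-memory model in Python. It is not how the service is implemented, and `PartitionedTable` is a hypothetical class. Each method returns the matching entities together with the number of entities it had to examine, which stands in for query cost.

```python
from bisect import bisect_left, bisect_right


class PartitionedTable:
    """Toy table that reports how many entities each query examines."""

    def __init__(self, entities):
        self._partitions = {}
        for e in entities:
            self._partitions.setdefault(e["PartitionKey"], {})[e["RowKey"]] = e
        # Sorted row keys per partition make range queries cheap.
        self._sorted_keys = {
            pk: sorted(rows) for pk, rows in self._partitions.items()
        }

    def point_query(self, pk, rk):
        """Exact PartitionKey and RowKey: a single lookup."""
        entity = self._partitions.get(pk, {}).get(rk)
        return ([entity] if entity is not None else []), 1

    def row_range_query(self, pk, low, high):
        """Exact PartitionKey, RowKey between low and high inclusive."""
        keys = self._sorted_keys.get(pk, [])
        selected = keys[bisect_left(keys, low):bisect_right(keys, high)]
        rows = self._partitions.get(pk, {})
        return [rows[k] for k in selected], len(selected)

    def partition_scan(self, pk, predicate):
        """Exact PartitionKey, filter on another property: scans the partition."""
        rows = list(self._partitions.get(pk, {}).values())
        return [e for e in rows if predicate(e)], len(rows)

    def table_scan(self, predicate):
        """No PartitionKey: every entity in every partition is examined."""
        rows = [e for part in self._partitions.values() for e in part.values()]
        return [e for e in rows if predicate(e)], len(rows)
```

With many partitions, the examined count for `table_scan` grows with the whole table, while `point_query` stays at one. That gap is the reason duplicating data into a second table with different keys can be worth the extra storage.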
- The table above stores email addresses. The partition key is the domain part of the email address and the mailname is the row key.
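Deriving those two keys from an address could look like the Python sketch below. `email_to_keys` is a hypothetical helper, and lowercasing the domain is my own assumption, made so that `Contoso.com` and `contoso.com` land in the same partition.

```python
def email_to_keys(address):
    """Split an email address into PartitionKey (domain) and RowKey (mailname)."""
    # rpartition splits on the last "@", so the domain is everything after it.
    mailname, separator, domain = address.strip().rpartition("@")
    if not separator or not mailname or not domain:
        raise ValueError(f"not a valid email address: {address!r}")
    return {"PartitionKey": domain.lower(), "RowKey": mailname}
```

With this layout, looking up one address is an exact match on both keys, and listing every address in a domain stays inside a single partition.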
- TableServiceEntity and TableServiceContext are used when programming with C# or Visual Basic. By deriving from TableServiceEntity you can define your own entities that get stored in tables. TableServiceContext is used when you wish to perform CRUD operations on tables and is not illustrated here.
- The Windows Azure Training Kit is the best way to get up and running.
- One of the labs is called Exploring Windows Azure Storage. It provides excellent examples on using storage.
- It can be found here (once you install the training kit) C:\WATK\Labs\ExploringStorage\HOL.htm
I appreciate that you took the time to read this post. I look forward to your comments.
(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)