Speaker : Edin Kapic
Scalability issues never appear when developing or testing. That is why it is better to design for scalability upfront.
Autohosted applications are not suited to a scalable architecture, as they can't be fine-tuned. And because Apps now run outside SharePoint, the first piece of advice is to minimize round-trips. Deploying an App does not by itself make it scalable; it must be architected for scalability.
A scalable architecture should include a CDN, blob storage, table storage and distributed cache.
Three guidelines: avoid round-trips (caching, CDN), avoid bottlenecks (NoSQL, sharding, queues), avoid single points of failure (redundancy).
Caching is the cheapest mechanism to avoid round-trips; stale data is the drawback. A local cache sits on each instance of the server or service, whereas a distributed cache is shared across the servers. Very frequently accessed data and static data should go in the local cache, but the distributed cache should be the default. DNS is an example of an effective caching mechanism: there is a small cache in the browser, a local cache at the operating system level, and finally a cache at the DNS resolver level.
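The layering described above (local cache first, then the distributed cache, then the origin) can be sketched roughly as follows. This is a minimal illustration, not something shown in the session: the `TwoLevelCache` class, its `local_ttl` parameter and the plain dictionary standing in for a distributed cache client are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TwoLevelCache:
    """Local in-process cache in front of a shared (distributed) cache."""

    def __init__(self, shared: Dict[str, Any], local_ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.shared = shared        # stand-in for a distributed cache client
        self.local_ttl = local_ttl  # seconds a local entry may be served
        self.clock = clock          # injectable so expiry can be tested
        self.local: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, load: Callable[[str], Any]) -> Any:
        now = self.clock()
        hit = self.local.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                 # 1. local hit: no round-trip at all
        if key in self.shared:
            value = self.shared[key]      # 2. shared hit: one cache round-trip
        else:
            value = load(key)             # 3. miss: go to the origin
            self.shared[key] = value
        # Keep a short-lived local copy; a short TTL bounds how stale it gets.
        self.local[key] = (now + self.local_ttl, value)
        return value
```

A short local TTL is the knob that trades round-trips against staleness: the longer an instance keeps its own copy, the longer it may serve data that another instance has already changed.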
CDNs are used to cache large blob data. Each blob can have a public URL (public blob). For private blobs, a shared access signature is included as part of the URL to grant access. The first user accessing content from the CDN pays the price of loading the blob into the CDN's cache. Everything that is static, such as images, scripts and media files, should go in the CDN. To ensure that the correct version of a blob is served, the URL can contain a version parameter.
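One possible way to build such a versioned URL is to derive the version from the file content, so that the URL changes only when the file does. The sketch below assumes a hypothetical `versioned_url` helper, a query parameter named `v` and an example host name; none of these come from the session.

```python
import hashlib
from urllib.parse import urlencode


def versioned_url(base_url: str, path: str, content: bytes) -> str:
    """Return a CDN URL whose 'v' parameter changes whenever the content does."""
    # Assumption: a short content hash is used as the version marker.
    version = hashlib.sha256(content).hexdigest()[:12]
    query = urlencode({"v": version})
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}?{query}"


# Usage: a new build of app.js yields a new URL, so the stale cached copy
# is simply never requested again.
old = versioned_url("https://cdn.example.com", "js/app.js", b"console.log(1)")
new = versioned_url("https://cdn.example.com", "js/app.js", b"console.log(2)")
```

Depending on how the CDN is configured, the query string may need to be part of the cache key for this to work; a version segment in the path is a common alternative.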
Storage locks are one cause of bottlenecks. Database locks appear when write and read requests are mixed. While relational data and SQL Azure provide immediate consistency, with NoSQL or Table Storage there is only eventual consistency. CQRS is a pattern that splits database operations into queries and commands so that they can be processed differently. Queries can be optimized by parallelizing them, whereas commands can't. SharePoint 2013 uses more or less the same pattern: search serves the queries, which are cached, while other operations are done in the content database. Sharding is partitioning data across multiple databases. In O365, the tenant ID is used as the partition ID. This is a way to go beyond storage limitations. On the other hand, join operations become more difficult.
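To make the sharding idea concrete, here is a minimal sketch that routes every record to a shard chosen from its tenant ID. The `shard_for` function, the `ShardedStore` class and the hash-modulo routing are illustrative assumptions, with in-memory dictionaries standing in for separate databases; this is not how O365 actually does it.

```python
import hashlib
from typing import Any, Dict, List, Tuple


def shard_for(tenant_id: str, shard_count: int) -> int:
    """Map a tenant ID to a shard index with a stable hash."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Each dictionary stands in for one database (shard)."""

    def __init__(self, shard_count: int) -> None:
        self.shards: List[Dict[Tuple[str, str], Any]] = [
            {} for _ in range(shard_count)
        ]

    def _shard(self, tenant_id: str) -> Dict[Tuple[str, str], Any]:
        return self.shards[shard_for(tenant_id, len(self.shards))]

    def put(self, tenant_id: str, key: str, value: Any) -> None:
        self._shard(tenant_id)[(tenant_id, key)] = value

    def get(self, tenant_id: str, key: str) -> Any:
        # Single-tenant access touches exactly one shard.
        return self._shard(tenant_id).get((tenant_id, key))

    def all_for_key(self, key: str) -> List[Any]:
        # Cross-tenant queries (the "join" case) must visit every shard.
        return [v for shard in self.shards
                for (_, k), v in shard.items() if k == key]
```

The last method shows the drawback mentioned above: anything that spans tenants has to fan out across all shards and merge the results in application code.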
Bottlenecks can also be reduced by using queues. The request/response model does not scale well and gets expensive very fast. By queuing requests, we add decoupling, and retries can be implemented. DDoS attacks can be mitigated. If the number of requests in a queue gets high, the system can be scaled. Azure storage queues are low-level and use TCP/IP (end-to-end scenario). If you need a centralized queue system, it is better to use Service Bus queues. To notify the front end that a job has been done, use a framework like SignalR. Async is a way to optimize request handling so that the same process can serve more than one request at a time. Async can work even on a single thread. Having multiple threads is better when there are multiple cores. Until .NET 4.5, it was a bit difficult to implement such a solution.
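The queue-plus-retries idea and the single-threaded async idea can be illustrated together. The sketch below uses Python's `asyncio` rather than .NET, with an in-process `asyncio.Queue` standing in for a real queue service; the `worker` and `process` functions and the `max_attempts` parameter are hypothetical names chosen for this example.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Tuple

Job = Tuple[str, int]  # (payload, attempts made so far)


async def worker(queue: "asyncio.Queue[Job]",
                 handle: Callable[[str], Awaitable[None]],
                 max_attempts: int,
                 failed: List[str]) -> None:
    while True:
        payload, attempts = await queue.get()
        try:
            await handle(payload)
        except Exception:
            if attempts + 1 < max_attempts:
                queue.put_nowait((payload, attempts + 1))  # re-queue for retry
            else:
                failed.append(payload)                     # give up on this job
        finally:
            queue.task_done()


async def process(payloads: Iterable[str],
                  handle: Callable[[str], Awaitable[None]],
                  workers: int = 3,
                  max_attempts: int = 3) -> List[str]:
    """Run all jobs through the queue; return payloads that never succeeded."""
    queue: "asyncio.Queue[Job]" = asyncio.Queue()
    failed: List[str] = []
    for payload in payloads:
        queue.put_nowait((payload, 0))
    # All workers share one thread; they interleave whenever one awaits I/O.
    tasks = [asyncio.create_task(worker(queue, handle, max_attempts, failed))
             for _ in range(workers)]
    await queue.join()          # wait until every job, retries included, is done
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return failed
```

Calling `asyncio.run(process(jobs, handle))` drives everything on a single thread. The producer only enqueues and returns, which is the decoupling described above, and adding capacity amounts to raising the number of workers.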
In a redundant design, the goal is to avoid relying on a single node, as the app must continue working if a node goes down. In redundant apps, each request must be idempotent. Load balancing is an example of redundancy. Azure Traffic Manager maintains a table of the available nodes by continually probing them to check whether they are online. When a request comes to Traffic Manager, it determines which server is the most appropriate and returns its address to the client.
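The probe-and-resolve behaviour described above can be mimicked with a small table of nodes. This is only a conceptual sketch and not Azure Traffic Manager's implementation: the `NodeTable` class, the injected `probe` callable and the "first online node in priority order" policy are all assumptions.

```python
from typing import Callable, Dict, List, Optional


class NodeTable:
    """Tracks which nodes are online and picks one for each incoming request."""

    def __init__(self, nodes: List[str], probe: Callable[[str], bool]) -> None:
        self.nodes = list(nodes)  # assumed to be listed in priority order
        self.probe = probe        # e.g. an HTTP health check, injected here
        self.online: Dict[str, bool] = {n: False for n in self.nodes}

    def refresh(self) -> None:
        """Probe every node; a failing or erroring probe marks it offline."""
        for node in self.nodes:
            try:
                self.online[node] = bool(self.probe(node))
            except Exception:
                self.online[node] = False

    def resolve(self) -> Optional[str]:
        """Return the address of the first online node, or None if all are down."""
        for node in self.nodes:
            if self.online[node]:
                return node
        return None
```

In practice `refresh` would run on a timer, and the selection policy could just as well be based on latency or weights. Because a client may be redirected to another node and retry, the idempotency requirement above matters: the same request may reach the back end more than once.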