
WinFS Blog: My tryst with Destiny, err… Integrated Storage

February 24, 2006

Hi, my name is Sanjay Anand. I run the Program Management team for WinFS. I’ve been at Microsoft for almost 12 years now, in which time I have been involved with networking, security, messaging, file systems, and of course integrated storage 🙂


Prologue:


A very very long time ago, my friend and I joined Microsoft to work on “next generation” Operating Systems – I was on Windows NT, he was on Cairo. Cairo was this sexy, glitzy, very futuristic attempt at building an integrated store - an object file system. This was the thing that would replace Windows NT the moment it shipped. I plodded along on the not-so-new NT, knowing that Cairo would come along and replace most of what we were working on. Then one day, Cairo disappeared and the machinery moved on to salvage parts of Cairo to deliver on the premium OS – Windows NT. At that time I didn’t think much of the whole thing except that it was perhaps far ahead of its time…


Webstore circa 1999:


A half-decade later, I encountered “Integrated Storage” for the second time while working on the Exchange Webstore. The premise was to build a single enterprise store that would offer integrated manageability, multi-tier deployment, support for multiple protocols and data access stacks, and a platform for app development. It was a file server, a web server, a corporate email server, internet email server, collaboration server, all rolled into one. The file server in particular was the piece I worked on. We actually shipped this product in Exchange 2000, but it was stillborn. The technological underpinnings weren’t quite there – it lacked real transactions, had issues with scalability, and was too mail-oriented. The Win32 support was too hard – we mastered the entire file system in a single file, literally re-implementing NTFS within a file, with our own block management strategy, our own locking implementation, complex interactions with the cache manager – oh! It was a nightmare, but we got it limping along. A brave effort nonetheless and something we were to learn from eventually!


The client manifestation of this Webstore was called the LIS (local information store) – a valiant attempt at building a client platform to match the Webstore on the server. Office Designer was the dev environment on the client and Outlook was the marquee application on the client to use that store. The beginnings of the Outlook offline store took root here. The challenges were two-fold: take the server store and “shrink” it to the client, and provide a MAPI shim over an HTTP wire protocol for the sync. The performance never really matched the requirements of Outlook, the store was too general purpose... this went by the wayside as well.


Fast forward to 2003:


Another half decade later, we launched with renewed energy towards the 5th attempt (and my second) at Integrated Storage – WinFS. A lot of factors were in our favor – the need for a storage platform was more acute than ever with digitally born data exploding every day, a lot of technological underpinnings were there for the taking: transactions in file systems, mature databases, sophisticated programming languages, but most importantly, a set of very motivated individuals who had deep experience in the various elements required to bring together this complex technology. We had folks with deep database experience from QO/QP to transactions, aces in programming languages, folks with immense experience in file systems, distributed systems, web services, hard-core synchronization wonks, O/R mapping gurus, but above all, a fearless leader who was ready to take a real shot at this thing once and for all and who had the maturity and discipline to control the engineering of this innovation – Peter Spiro.


The project started with much fanfare - a who’s-who of very senior people passionately oversaw the vision, tons of internal teams hooked into the common schema effort - identity services, state management, Windows Help, natural languages, Windows shell, Pix, Media, etc. At one point we also had over 80 external ISVs who were at various levels of commitment. These were heady times; we knew this many engagements was unsustainable for any single platform to serve, but we also knew they were essential to distilling the core nuggets of innovation to target for a v1. The Vista change occurred and, while negative on the surface, from a project perspective it actually helped us take pause, distill our core value props and run really fast at hitting these. There are many ways where WinFS adds value; however, I want to touch upon one that really appeals to me: a storage platform for application development.


Storage platform:


First off, we are about building a storage platform that enables applications to model their data in a much richer fashion, ''while still benefiting from common services one would expect from a file system: administration, sync, sharing, security, backup, drag-n-drop, move/copy, delete and yes, search.'' File systems don't allow applications to model their data in anything richer than byte streams when it comes to interacting with other applications, Windows shell, etc. Databases allow you to model your data in richer ways by leveraging relational capabilities; however, none of them automatically enables that data to be shared with the Windows Shell much as a file is. So, there you have it - we are the best of file systems (integration and services) and databases (rich data modeling capabilities) rolled into one - hence Integrated Storage.


Now, let's move further and see how you can do much more if you actually share that rich data definition across applications. Let me take an example: reusable organization is an extremely powerful concept – something that application-specific tagging cannot even come close to achieving. Consider this scenario: at each of my daughter’s b’day celebrations at her school, we take in a poster of her pictures from previous years. I search for the pictures in one app (Digital Image Suite), auto-fix in another (Photoshop), and print using HP’s custom album printing app. Each app has some notion of query-based searches, but no two share it. So, whenever I transition between applications, I end up duplicating my photos into folders since that’s the common denominator across applications. Then, I go through and delete these duplicates which I have collected over time, including copies of edited ones, ones I saved off to put on ophoto.com, post on spaces, email to my sister, the list goes on and on. Let me admit that one of the most difficult item types to delete is a photo – you can buy music again, re-write a document, but not a photo. So, I end up not deleting the photos many times… Yes, storage is cheap, but this is ridiculous – and I am a reasonably tech savvy person... I recognize that there are several answers to this scenario: build a better single app to do the entire photo workflow; standardize on a common tag file format, etc. However, the same scenarios apply to music – my organization in Media Player is lost in iTunes. I use both: the first to stream, the second for my iPod. In fact they apply to all forms of digital data.


The point is that the file system today is too primitive a substrate for these applications to work over and WinFS attempts to close this gap by raising the level of involvement in the application’s data: it’s a trade – the more apps tell the system about their data, the more the platform does for their types in terms of allowing an ecosystem of applications to interact with their content. The user organizes her content in one app - sets up a query to filter down the photos – “all 5-star photos of my daughter that I haven’t used earlier, grouped by year taken, ordered by resolution” and another app uses these queries to edit the set, etc. This is achievable with a rich underlying storage platform, just not today.
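
To make the photo query above a little more concrete, here is a minimal Python sketch of the idea. It is not the WinFS API: the `Photo` type, its fields, and the `unused_five_star_by_year` function are all hypothetical names invented for illustration, and an in-memory list stands in for the store. The only point is that once the photo's shape is described to a shared layer, the query can be written once and reused by any application.

```python
from dataclasses import dataclass
from itertools import groupby


# Hypothetical shared item type; these field names are illustrative only
# and are not taken from any real WinFS schema.
@dataclass(frozen=True)
class Photo:
    name: str
    subject: str
    rating: int          # 1 to 5 stars
    year_taken: int
    megapixels: float    # stand-in for resolution
    used_before: bool


def unused_five_star_by_year(photos, subject):
    """Return {year: [photos]} for unused 5-star photos of `subject`,
    with each year's photos ordered by resolution, highest first."""
    matches = [
        p for p in photos
        if p.subject == subject and p.rating == 5 and not p.used_before
    ]
    # Sort by year first so groupby sees each year as one contiguous run.
    matches.sort(key=lambda p: (p.year_taken, -p.megapixels))
    return {
        year: list(group)
        for year, group in groupby(matches, key=lambda p: p.year_taken)
    }


# Made-up sample data standing in for the contents of the store.
photos = [
    Photo("beach.jpg", "daughter", 5, 2004, 5.0, False),
    Photo("cake.jpg", "daughter", 5, 2005, 8.0, False),
    Photo("park.jpg", "daughter", 5, 2005, 6.0, False),
    Photo("old_poster.jpg", "daughter", 5, 2004, 4.0, True),
    Photo("dog.jpg", "dog", 5, 2005, 8.0, False),
    Photo("blurry.jpg", "daughter", 2, 2005, 8.0, False),
]

result = unused_five_star_by_year(photos, "daughter")
for year, group in result.items():
    print(year, [p.name for p in group])
```

In this sketch the search app, the editing app and the printing app would all call the same query over the same items, rather than each copying photos into a folder to hand them to the next one.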


Now and beyond:


At this point, we have perhaps a better shot at this than ever before – we have robust file system support – perhaps the best implementation one could hope to build. We have gone through a number of iterations on our data model, so we have higher confidence that we are closer to the eventual answer ;-). A great platform needs a great API and we have that with our alignment with ADO.NET vNext. We have solutions to hard questions around business logic, cross-store versioning, security, a strong synchronization story, the list goes on and on. There is a lot more to solve and gel, but this for me is by far the closest to hanging together.


I am also a '''huge''' believer in learning from shipping and real world exposure. So, right now, it’s on to shipping this baby and getting it in your hands. As with any platform, the real proof lies in the universe of applications and how they build on that platform. Non-Microsoft applications, btw 😉


The dream continues…

Author: Sanjay Anand