I believe that storage will be in the vanguard of enterprise cloud adoption. One reason is that storage is simple (especially compared with applications), and thus easier to experiment with. Another is that demand for storage is growing faster than the cost of storage is falling while IT budgets remain flat – so anything that purports to increase storage efficiency will likely be at least tried. As with all new technologies, there are always roadblocks to adoption. The most daunting today seem to be security and potentially higher costs versus on-premise storage.
Security has at least two aspects – the technical and the legal. Most people would agree that a well run cloud storage service provider is likely to be more technically secure than the average enterprise, so concerns in this arena have far more to do with perception than reality. The legal aspect of security, however, is a real sticking point, and involves the need to comply with a whole slew of industry specific and geopolitical regulations. This suggests opportunities for startups with technologies that could automatically classify data into “cloud ready” and “in-house only” buckets, such as Expert System, Textwise and Syntactica.
Costs can be broken down into the cost of the cloud storage itself and the cost of the bandwidth required to convey the bits to/from the cloud. With regard to the former, it is generally true that a well utilized and properly managed in-house storage infrastructure is cheaper than its equivalent cloud storage counterpart. For all the hype about cloud storage “turning capex into opex”, the fact is that capex can be translated into opex simply by dividing the purchase cost by the expected lifetime of the storage infrastructure, and it’s not all that uncommon for it to be cheaper to buy than rent. One of the reasons for this cost disparity is the lack of a cloud storage hierarchy.
In the on-premise storage world, there’s a whole spectrum of storage media ranging from the high price/fast access solid state drives to low price/slow access tape drives. Thus an appropriate blend of these tiers tends to be cheaper than the always-on, one-price-fits-all cloud storage hawked by most cloud storage providers. Offering multiple tiers of cloud storage, such as what Diomede Storage proposes, is one way to close the gap. Another way to overcome the cloud storage cost issue is to use a radically different backend architecture, such as p2p. This is highly risky – many enterprises tend to have violent emotional reactions to p2p schemes, regardless of how technically secure they actually are, and it doesn’t help that there’s a crowded graveyard of (mainly consumer-oriented) p2p storage startups. Symform and Comvaya are a couple of brave young companies that are pursuing this approach. Moving on to bandwidth costs – it is true that any use cases that involve the frequent movement of prodigious amounts of data (e.g. primary backup) are very unlikely to be economic for cloud storage in the near term. That being said, there are plenty of other use cases (e.g. archiving) that can be quite appealing. In some of these, it’s not uncommon for a WAN link pointing to a secondary data center to already be in place, and there’s no additional cost to simply redirect it at a cloud storage service provider. One bandwidth-related opportunity that’s often overlooked is the ability to automatically move data closer to the applications that actually use it, which can have dramatic impact on application performance. This is, in effect, delivering previously unaffordable CDN-like technology to enterprises. Asankya and Pixel8 are examples of companies doing interesting things in this space.
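As a rough illustration of the cost argument in this post, the sketch below turns capex into a monthly figure by dividing it by expected lifetime, then blends several in-house tiers and sets the result beside a single flat cloud price. Every price, lifetime and data share here is invented for the example, and the `Tier` structure is a hypothetical helper. Power, staffing, bandwidth and utilization are ignored, so treat it as a way to frame the comparison rather than as a pricing model.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    capex_per_gb: float   # hypothetical purchase price, USD per GB
    lifetime_months: int  # expected useful life of the hardware
    share: float          # fraction of total data held on this tier


def monthly_cost_per_gb(tier: Tier) -> float:
    """Translate capex into an opex-style monthly figure by dividing by lifetime."""
    return tier.capex_per_gb / tier.lifetime_months


def blended_monthly_cost_per_gb(tiers: list[Tier]) -> float:
    """Weight each tier's monthly cost by the share of data stored on it."""
    total_share = sum(t.share for t in tiers)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    return sum(monthly_cost_per_gb(t) * t.share for t in tiers)


# All numbers below are assumptions made up for illustration only.
tiers = [
    Tier("ssd", capex_per_gb=12.0, lifetime_months=48, share=0.05),
    Tier("disk", capex_per_gb=2.4, lifetime_months=48, share=0.35),
    Tier("tape", capex_per_gb=0.48, lifetime_months=96, share=0.60),
]
flat_cloud_price = 0.15  # hypothetical one-price-fits-all rate, USD per GB-month

in_house = blended_monthly_cost_per_gb(tiers)
print(f"in-house blended: ${in_house:.4f}/GB-month")
print(f"flat cloud price: ${flat_cloud_price:.4f}/GB-month")
```

Changing the shares shows how sensitive the comparison is to the blend: the more data that can sit on the cheap, slow tier, the harder it is for a single always-on price to compete.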
Many startups will salivate at the prospect of grabbing a slice of the search market. And there certainly have been many strenuous exertions – with everything from local search to vertical search to semantic search to real time search to novel search visualizations & interfaces etc. etc. etc. The result? Continued Google dominance. The fact is that Google owns search today, and it is quite unwise for a startup to challenge them on their home turf. Rather than trying to out-search Google, a shrewder approach would be to figure out how to break search. What can you do that will make search irrelevant & obsolete?
Let’s say you’re a tourist visiting a foreign city 15 years ago. How would you find the things that interest you? Most likely, you’ll go to the local tourist information office or consult a guide book – the search engines of the pre-internet era. Now, what alternatives do you have that will enable you to completely bypass those resources?
First, perhaps you have a friend living in the city that you can ask. In other words, a mode of discovery based on trust and relationships. This in effect is the gravy train that the owners & operators of social graph data have been frenetically chasing, albeit futilely (for now). I think it’s a matter of when, and not if, someone cracks the code. And I think it’s going to be much more about calibrating the right user interface than about conjuring flashy new algorithms.
Second, suppose you possess a telepathic ability to know what tourists similar to you enjoy doing in this city. In other words, discovery based on what people like you like. In practice, this usually involves various schemes of coaxing people to part with personal information in exchange for suggestions, such as bookmark sharing, or perhaps aggregating & mining reams of personal browsing histories to generate recommendations.
Third, let’s say you have a crush on Jennifer Lopez, and thus want to go stay at the hotel she stayed at, eat at the restaurants she ate at, shop at the stores she shopped at etc. In other words, discovery based on what people you admire like. This could be particularly interesting as it ties in closely with e-commerce and has the potential to produce many unexpected pivots between people, media and objects of desire.
I’m sure there are countless other strategies that entrepreneurs can dream up. To sum up, if you aspire to beat Google, it’s not about cooking up a better search, it’s about how you can devise ways to shrink the search market.
Commoditizing your products’ complements is a shrewd tactic, given that the lower their price, the higher the demand for your products. For example, free & abundant roads did wonders for US automobile sales. And the perpetual downward spiral of PC hardware prices contributed much to the corresponding proliferation of PC software. With that thought in mind, it’s instructive to review the Open Cloud Manifesto, a vaguely worded declaration on cloud interoperability that could have just as easily sprung from the bowels of the United Nations. There was much hullabaloo about who signed and who didn’t, the latter including several companies that often take the moral high ground on standards and openness. Ultimately, it all boils down to who has the most to gain from commoditizing the cloud. Peddlers of complementary products like consulting services and hardware have given the Manifesto their tightest embrace, while cloud service providers, unsurprisingly, will continue to be conspicuously absent.
It wasn’t that long ago when it was still fashionable to debate the relative merits of good old Presentation Server versus this new “VDI” thing.
There was even a memorable session at a virtualization conference that featured a group of server based computing evangelists dueling against the backdrop of two giant boxing gloves, concluding with the solemn prediction that VDI will soon “knock out” its more mature (and presumably obsolete) relative. Today, it’s becoming obvious that there are not just two, but many, many ways to do desktop virtualization – separations can be made at the bare metal, operating system, application, user state or presentation layers (plus several variants in between), each with its own pros and cons. So rather than having “one that rules them all”, it’s likely that the world ahead will fragment into a multiplicity of desktop virtualization flavors, each with its own niche of end-user scenarios. The challenge, though, is that IT administrators won’t want to have to manage each flavor of desktop virtualization in its own silo. I believe that some kind of universal management system that can dynamically compose the appropriate user experience based on the endpoint context will likely be very valuable in the years ahead.
Today, IT administrators are spoilt for choice when it comes to products promising to optimize the utilization of their virtualized environments. What these products basically do is use clever algorithms to stuff as many virtual machines as possible into a physical server so as to make full use of the underlying CPU, memory, I/O etc. The underlying assumption is that the fewer physical servers you have, the lower your datacenter costs. Seductive, but untrue. While there are many studies out there with slightly different numbers, most people will agree that power represents a large and growing slice of datacenter costs, and that the power consumed by CPUs and the associated cooling needed to keep them from blowing up in turn represents a significant portion of those power costs.
Now overlay that with the fact that the power consumed by a CPU tends to increase exponentially with utilization, particularly at higher rates of utilization. What that implies is that it could actually cost more to run one physical server at 100% CPU utilization than two physical servers at a lower CPU utilization, meaning that your fancy virtualization optimization software could inadvertently be increasing your datacenter costs. Perhaps an opportunity for an enterprising startup to bridge the server virtualization & power management worlds?
Managing my meeting schedule is annoying, time consuming but necessary – sort of like filing my tax return. But unlike tax return software, which has significantly improved the user experience, scheduling software has barely evolved over the past decade. Scheduling a meeting sounds deceptively simple. It starts with one party making an offer of a date/time, then the other parties either accepting the offer or making counter-offers until an agreement is reached. In practice, it can get horribly messy, often devolving into a linear programming exercise conducted over email with complexity that increases exponentially with the number of parties. What if I could just hand all that unpleasant negotiation over to an automated agent? A request could take the form of a party or parties that I want to meet with, the mode of the meeting (in-person or remote) plus any restrictions around the latest time that the meeting must take place by. The agent could be subject to some personalized rules, such as the appropriate hours that meetings can take place, and the appropriate buffer time between meetings. A company that does some elements of this is TimeBridge – but I’ve encountered very few adopters, and it’s not encouraging that neither of their VCs actually uses their service. One function that the agent could deliver is the ability to automatically reschedule a chain of meetings on the fly.
So if, say, my flight is delayed, I don’t have to shoot off a flurry of emails and calls to reconfigure my day. It may also maintain a cache of “nice-to-have” meetings that can be quickly slotted in should another meeting be unexpectedly canceled. Another function I would like is for the agent to be smart about the location of the parties I want to meet. For a remote meeting, this includes taking into account time zone differences, thus eliminating common gaffes like the Israeli startup that recently invited me for a web conference at 4am my time. For an in-person meeting, this includes automatically grouping meetings in similar locations together so as to minimize travel time. Finally, it would be nice if the agent could be predictive, rather than reactive, to my needs. One example could be to suggest potential meetings to attend given where I’m planning to be. I’ve lost count of the number of times I was stuck cooling my heels in a coffee shop when I could have instead been meeting with an interesting startup or attending a networking event just round the corner. LuckyCal makes a valiant attempt at this, though it’s currently more focused on consumer rather than business scenarios.
Fear of vendor lock-in is one of the most cited reasons by enterprises for staying out of the cloud. Predictably, this has led to a rash of cloud interoperability proselytizing, frequently invoking visions of cloud nirvana where applications flit effortlessly across multiple data centers and cloud service providers in obedience to some universal standards. Interoperability standards, however, merely make it possible to move stuff around. They do not make it practical to move stuff around. And one thing that is most certainly impractical to move around today is huge volumes of data, particularly if it’s tethered to production workloads.
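To put a rough number on how impractical moving data can be, here is a back-of-the-envelope sketch of transfer time over a WAN link. The data volumes and the 100 Mbps link speed are assumptions picked for illustration, the `efficiency` parameter is a hypothetical knob for protocol overhead and contention, and real transfers will vary.

```python
def transfer_days(data_terabytes: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Days needed to push a dataset through a link at a sustained rate."""
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    bits = data_terabytes * 1e12 * 8                 # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)  # megabits per second to bits per second
    return seconds / 86_400                          # seconds in a day


# Assumed volumes and link speed, for illustration only.
for tb in (1, 10, 100):
    print(f"{tb:>4} TB over 100 Mbps: {transfer_days(tb, 100):.1f} days")
```

Even with the link fully dedicated to the migration, the larger volumes take weeks or months, and that is before accounting for changes made to the data while the copy is in flight.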
So for enterprises fidgeting over cloud vendor lock-in, thinking really hard about where & how they want to store their data will likely be far more productive than interoperability evangelism. And for analysts trying to predict which cloud service providers will dominate the landscape, a careful evaluation of how easy & economical they make uploading & managing data will likely be a sound leading indicator.
Among the potential casualties of the economic maelstrom are 11 public libraries in Philadelphia. This is not good. Many people I know, myself included, have benefited tremendously from public library infrastructure. Even if we choose to believe the (somewhat dubious) claims by the Kindle-clinging crowd that reading printed books will soon be consigned to a geriatric pastime, public libraries still provide a place for quiet study, for community gatherings and (ironically) for internet access. Drastic times call for creative measures. Many public libraries sit on a rich vein of data – the borrowing history of their customers – that can be mined & monetized for delivering highly targeted ads. For example, people typically check out books like “What to Expect When You’re Expecting”, “Fodor’s Disneyland for Kids” or “The Official Guide for GMAT Review” for quite specific reasons. Also, hobbies such as canoeing, cooking or calligraphy can be easily discerned from borrowing patterns. That being said, there are many issues that will need to be addressed, such as ensuring advertisers do not get their hands on personally identifiable information (PII), and that regulations like COPPA are complied with. Ads can be delivered in a wide variety of form factors. They can be banners or text strings on the library website and/or online catalog. They can be appended to alerts that libraries send out to remind customers that a book is due, or that a hold is available for pickup.
They can even take on more traditional forms like inserts in library newsletters, or get printed on the book check-out receipts. And as many advertising executives know, accurately targeted ads command correspondingly high CPMs. Lots of folks will probably squeal in horror at the very suggestion of such desecration of a hallowed institution. Though personally, I think that an ad-supported library is better than no library at all.
If I were a stock analyst trying to predict the movement of the markets, I would have an abundance of analytical tools at my feet. Similarly, if I were a geologist searching for oil, a meteorologist second guessing the weather, or even a marketer trying to decide if the next big ice cream flavor will be Kahlua Walnut Banana Chip or Coconut Cherry Chocolate Crunch, the shelves groan with a plethora of tools just begging to be put to work. However, if I were an IT manager seeking to make sense of the cascades of operations data spewing from all corners of my datacenter, the paucity of options is dispiriting. Certainly, there are point solutions out there which are quite good within their (very) narrow domain, such as a particular vendor’s products or a particular corner of the datacenter. And of course there’s Splunk, a search engine for operations logs, which is great for firefighting but unable to stop bad things from happening in the first place. What would be really cool is a holistic system that can ingest and normalize operations data from all layers of the stack (including power equipment) in real time & cross-correlate it to determine interdependencies. This will enable at least two things: higher capacity utilization since you can run things hotter if you can predict what’s going to happen, and reduced manual labor & downtime since automatic alerts and remediation can be executed the moment there’s the slightest whiff of trouble.
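A toy version of the normalize-and-cross-correlate step might look like the sketch below. It assumes the metrics have already been collected and aligned to the same timestamps, which is most of the hard work in practice. The metric names and sample values are invented, and plain Pearson correlation is used as a simple stand-in: a high score hints at a relationship between two layers of the stack but does not prove a dependency or say which way it runs.

```python
from math import sqrt


def zscores(values):
    """Normalize a series to zero mean and unit variance so units stop mattering."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("a constant series cannot be normalized")
    return [(v - mean) / std for v in values]


def correlation(a, b):
    """Pearson correlation of two equally long, time-aligned series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must be the same length, with at least two samples")
    za, zb = zscores(a), zscores(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)


def correlated_pairs(metrics, threshold=0.9):
    """Return metric pairs whose absolute correlation meets the threshold."""
    names = sorted(metrics)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = correlation(metrics[first], metrics[second])
            if abs(r) >= threshold:
                pairs.append((first, second, round(r, 3)))
    return pairs


# Hypothetical, hand-made samples taken at the same six instants.
metrics = {
    "cpu_util_pct": [20, 35, 50, 65, 80, 95],
    "pdu_watts": [210, 240, 275, 310, 350, 400],
    "disk_queue_len": [3, 1, 4, 1, 5, 2],
}
print(correlated_pairs(metrics))
```

With these made-up samples only the CPU and power series clear the threshold. A real system would also need to handle time lags, missing samples and far more metrics than a pairwise loop can comfortably cover.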
If we believe that IT infrastructure will become increasingly centralized and industrialized, that implies datacenters are going to get larger and their innards stuffed with an ever shifting spectrum of heterogeneous technologies, a far more complex beast to tame. It’s unlikely that a patchwork of point solutions plus the raw muscle of manual labor is going to be up to the task. Perhaps a golden opportunity for some erstwhile hedge fund quant seeking their next challenge.
The digital photos & videos I take over a weekend often exceed the total storage capacity of the PC I had back in my college days. This surging tide of consumer-created photos & videos hit a milestone in 2008, when for the first time the total capacity of consumer storage shipped exceeded that of enterprise storage (Morgan Stanley’s Internet Trends). Despite this burgeoning photo & video sprawl, a simple & effective way for consumers to back up all this stuff has proved elusive. The vast majority of people I know are one hard disk crash away from losing their digital memories. Permanently. Of course, there’s the old school solution, typically involving making regular manual backups onto an external hard drive or some kind of optical disk. The main problem here is that it takes discipline to do this, which most folks lack. A secondary problem is that external disks, and especially optical disks, can and do fail. While there are automated alternatives involving some variation of home file servers or even network-attached storage appliances, those generally exist in the province of geekdom and are well beyond the financial & technical means of the average consumer. Then there are the (relatively) newer cloud storage alternatives. Mozy, Carbonite and Jungle Disk are just a few of a bewildering array of me-too service providers that have mushroomed over the past 3 years or so. The main problem here is turtle-like upload speeds (at least in the US).
And in many cases, this comes accompanied with performance degradation on your PC, rendering it impotent except for the most basic tasks. Yes, I know you can just leave your PC on overnight to complete the uploads, but that’s hardly a positive user experience. I wonder about the potential for some kind of hybrid solution. Say a smallish USB-attached appliance that automatically detects and grabs the appropriate new files from a PC, caches them on-board and then gradually bleeds them wirelessly up into the cloud. The price of the appliance can be kept low by subsidizing it from a recurring fee for the cloud storage service. Ctera has something similar in the market today, but it targets small business & isn’t consumer friendly. Rebit has a very consumer friendly appliance, but doesn’t have an attached cloud service. Oh if only they would make something together…
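The detect, cache and trickle-upload behavior of such an appliance could be sketched roughly as follows. This is only an illustration of the idea and says nothing about how Ctera or Rebit work: the `BackupQueue` class, the change test based on size and modification time, and the caller-supplied `upload` function are all assumptions, and error handling (files deleted mid-queue, failed uploads, retries) is left out.

```python
import os
from collections import deque


class BackupQueue:
    """Tracks files under a folder and trickles changed ones to an uploader."""

    def __init__(self, root, upload):
        self.root = root
        self.upload = upload      # caller-supplied callable: upload(path) -> None
        self.seen = {}            # path -> (size, mtime) recorded at the last scan
        self.pending = deque()    # on-board cache of paths awaiting upload

    def scan(self):
        """Queue every file that is new or changed since the previous scan."""
        queued = 0
        for folder, _dirs, files in os.walk(self.root):
            for name in sorted(files):
                path = os.path.join(folder, name)
                info = os.stat(path)
                signature = (info.st_size, info.st_mtime_ns)
                if self.seen.get(path) != signature:
                    self.seen[path] = signature
                    if path not in self.pending:
                        self.pending.append(path)
                        queued += 1
        return queued

    def drain(self, byte_budget):
        """Upload queued files until the byte budget for this interval is spent.

        At least one file is always sent, so a file larger than the budget
        cannot block the queue forever.
        """
        sent = 0
        while self.pending:
            size = os.path.getsize(self.pending[0])
            if sent and sent + size > byte_budget:
                break  # leave the rest for the next interval
            path = self.pending.popleft()
            self.upload(path)
            sent += size
        return sent
```

A caller would run `scan()` periodically and call `drain()` with a small budget on each tick, which is one way to keep the uplink from being saturated while the PC is in use.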
Yi-Jian Ngo
Core Infrastructure, Security and Storage
I have a passion for technology and want to apply that towards discovering and developing ideas into successful companies. At AT&T Strategic Ventures, my investments included OpenClovis, a telecom middleware vendor. I have executed $15B worth of M&A transactions, as well as held multiple operating roles in network en...
Recent Posts
Enterprise Cloud Storage
July 10, 2009
Shrinking the Search Market
May 30, 2009
A Cloudy Manifesto
April 22, 2009
Managing the Diversity of Desktop Virtualization
April 1, 2009
Can Virtualization Increase Power Costs?
March 27, 2009
Simplifying Scheduling
March 18, 2009