Saturday, October 13th, 2012 | Author:

Recently, Gartner, a research firm that specializes in IT consulting for large enterprises, issued a warning about OpenStack. It seems they’ve been getting a lot of questions from IT managers about this OpenStack thing and what to do about it. The warning was rather scathing, and it sparked a response from an OpenStack proponent, Mirantis. In reading that response, I felt the need to respond myself, but it was too long for a comment. Note that I’m not necessarily defending the original Gartner report, nor anyone else for that matter, but I think some of Gartner’s points were missed, and I also want to address a few of the points made.

The Mirantis response can be found at http://www.mirantis.com/blog/gartner-cannot-tell-about-openstack-cloud/ for reference as I go through the items. To the author: I appreciate your post and understand that having a horse in the race makes this Gartner report seem rather scathing. You need to respond to it. However, I think you missed a few points in some of your responses.

1. “The key is that OpenStack is the only open source cloud community with meaningful involvement from many commercial interests…”

OpenStack gets a lot of good press about its “partners”, when in reality those partners are hedging their bets and providing their solutions for multiple CMPs (cloud management platforms). Look at Nicira, CAStor, Brocade, NetApp, etc. There is a press release whenever one of these does something for OpenStack, even though they almost always simultaneously roll support for the same feature into other CMPs. If OpenStack is good at one thing, it’s self-promotion. That’s really the essence of the Gartner report.

2. “It is definitely open and, at this point in time, it is as close to a standard as it gets in the open source cloud ecosystem. Making the leap from interoperability challenges to “not an open standard” is unjustified.”

OpenStack may be open in the strict sense, but it’s not a standard. A company can’t just roll out an API and dub it a standard. Really, the only thing that qualifies at the moment is the AWS API, because everyone else is trying to be compatible with it.

3. “If I want to deploy OpenStack, I go to openstack.org, deploy it using freely available recipes from Puppet or Chef and I have myself a cloud. I can do it with almost any hypervisor, use almost any flavor of Linux as a host OS and run it pretty much on any hardware…”

The point here regarding vendor lock-in is that, as a customer of a company that uses OpenStack, it’s just as difficult to migrate out of OpenStack and into VMware as it is to migrate out of VMware and into OpenStack, both from the standpoint of writing an application against the API and trying to port it, and from that of migrating infrastructure between the two. It may even be tough to migrate between two OpenStack vendors, depending on the features they’ve implemented and the corporations they’ve partnered with to provide the various OpenStack components. This isn’t necessarily a knock on OpenStack; it’s just that it’s not a standard, and CIOs are perhaps getting the wrong impression from all of the recent OpenStack press.

4. “Beyond supporting Eucalyptus with a quote for their press release, Amazon doesn’t care about them. Eucalyptus would align with OpenStack marketing hype if it could; but it can’t” … “Don’t consider CloudStack. It will soon die.”…”CloudStack only runs on Ubuntu 10.04 host OS, which is 3 years old and doesn’t have driver support for some of the newer hardware. As time passes, this will only get worse.”

This point of yours, unfortunately, really undermines the validity of your whole post. It makes an otherwise level-headed post go off into the deep end, and loses much of the audience by making blatantly untrue remarks. This is the point where I stopped believing that you might know what you’re talking about.

5. “There have been only a few board meetings so far; all – very productive with no infighting. I can attest to that as someone, who personally participated in ALL meetings.”

I have no opinion on this, as I’m not sure what sort of inside information Gartner is referring to. I’d be inclined to believe you if you hadn’t already destroyed your reputation for being fair and truthful in #4.

6. “Stability is one big gray area and is a function of adoption, not time. Everybody knows that enterprise software sucks – commercial or open source. I can just as credibly make a claim that VMWare, Microsoft and Citrix are not stable and never will be.”

The point is that it’s not ready for prime time. OpenStack may be a great product with great goals, but at this point it’s neither easy to use nor stable enough to run without local expertise. VMware, AWS, and Microsoft, with all of their flaws, are far more stable, and far more importantly, there are plenty of experts. You just have to understand that Gartner is trying to help management make decisions, and at this point it’s not a great idea to jump in on OpenStack if you want to deploy a production environment. That doesn’t mean OpenStack is bad; it’s just letting reality settle in amid the buzz.

7. “VMWare is strong in the legacy, datacenter automation market. I wrote about it as well. OpenStack is competing more with AWS, not VMWare in the new, disruptive “open-cloud market.” It is true that, for now, enterprises fail to see the difference between VMWare and OpenStack. Longer term – it will change.”

Sure, VMware is working toward compatibility with multiple CMPs, and most CMPs aren’t exactly in the same market as VMware’s legacy datacenter automation business.

8. “Unlike with VMware or Microsoft, OpenStack is designed as a series of loosely coupled components that are easy to integrate with a variety of third party solutions and hardware platforms. The only reason why it doesn’t make sense to use OpenStack with commercial platforms like VMWare is because VMware’s hypervisor is only designed to work with VMware’s suite of products. “Maximizing interoperability…for multi-vendor substitution,” as suggested in this report is only possible with OpenStack and not commercial offerings or vendor centric, open source solutions.”

I don’t think this Gartner statement is attacking OpenStack, but just offering advice in general. I agree that the layered design is a strength of OpenStack, but it’s also the reason you get lock-in. I can’t be confident that an application I develop against OpenStack will work across OpenStack vendors, because each component is so loosely coupled that individual OpenStack vendors will have built proprietary solutions around them. OpenStack is great if you want to build something out of it, but not so great for those downstream. I’m not sure I believe the “only possible with OpenStack” comment.

Saturday, September 29th, 2012 | Author:

I’ve been spending a lot of time working on cloud solutions recently, and have run across this question countless times: which of CloudStack and OpenStack is going to win?  Some are worried about betting on the wrong horse, others already have a stake in one or are linked somehow to those who do, and some simply want to know which is best.  After having gotten to know both solutions, I think this is a short-sighted question, but I’d like to talk a little bit about each before explaining why. If you feel you already know the two well, then skip these sections for the tl;dr version.

CloudStack is the older of the two, and is undoubtedly more mature as of this writing.  It is a full-stack solution: the download you get provides management, storage, host agent, EC2 API compatibility, and everything you need to manage a cloud infrastructure. From a development standpoint, there is a CloudStack API for remote control, as well as the ability to develop plugins to modify or add functionality to CloudStack.
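
As a rough illustration of what driving that API looks like (the host, port, and keys below are placeholders, not a real deployment), a signed call from the shell goes something like this:

  # Sketch only: the signature is an HMAC-SHA1 of the sorted, lower-cased
  # query string, base64-encoded, and must be URL-encoded before sending.
  API='http://cloudstack.example.com:8080/client/api'
  QUERY='apiKey=YOUR_API_KEY&command=listVirtualMachines&response=json'
  SIG=$(echo -n "$QUERY" | tr '[:upper:]' '[:lower:]' \
        | openssl dgst -sha1 -hmac 'YOUR_SECRET_KEY' -binary | openssl base64)
  curl -s "${API}?${QUERY}&signature=${SIG}"   # URL-encode $SIG first in practice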

It supports KVM, Xen, VMware, and Oracle VM. It has support for bridging, VLAN management and direct network management, as well as recently added support for Nicira STT and plain GRE isolation for networking (both through Open vSwitch bridging). For VM disk storage, it allows NFS, clustered LVM, RADOS Block Device (Ceph), local storage, or any other storage that can be mounted as a ‘shared filesystem’, such as GFS or OCFS2. For backups, ISO images, VM templates, snapshots and such, there is a ‘secondary storage’, which is primarily NFS or OpenStack’s Swift object store.

It recently underwent a licensing change and is now under the Apache Software Foundation. With that change, some of the components that aren’t license compatible, such as the VMware plugins, now need to be downloaded separately. The developers are currently working through that migration, and in the future you may see ‘oss’ and ‘non-oss’ builds for download, or be directed in the documentation to fetch a specific package to enable a particular piece of functionality.  It of course has the backing of Citrix, and is used by everyone from large companies like SoftLayer and GoDaddy down to small businesses.

OpenStack started out as a project whose main contributors were NASA and Rackspace. It is younger, but has recently had a lot of news buzz, and is even referred to by some as ‘the media darling’, if an open source project can be described as such. It takes a more modular approach: it’s a group of services, including a storage service, compute service, network service, and a web dashboard. As such, installation and management are a bit more complicated, since each service is set up and configured independently, but that’s also a potential bonus in that each service can be swapped out independently for various solutions. Each service also has its own API and could be deployed on its own or incorporated into a separate project.
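
To give a feel for how separable those services are, each one is reached through its own client and endpoint. Here’s a sketch using the command-line clients of this era, with placeholder endpoints, credentials, and image/flavor names:

  # The identity service tells the clients where everything else lives.
  export OS_AUTH_URL=http://keystone.example.com:5000/v2.0
  export OS_USERNAME=demo OS_PASSWORD=secret OS_TENANT_NAME=demo
  glance image-list                                           # image service
  nova boot --image precise-server --flavor m1.small testvm   # compute service
  nova list
  swift upload backups dump.tar.gz                            # object storage (Swift)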

It supports xapi- and libvirt-based virtualization solutions, meaning Xen, KVM, LXC, and VMware to the extent that libvirt is capable. The current networking service has support for Linux bridging and VLAN isolation, and a new service called Quantum is in the works which will support SDN-based Open vSwitch isolation like Nicira’s.  For VM volume storage, it relies largely on iSCSI, and the storage service is capable of managing many iSCSI targets, with Ceph, local, and NFS support for VM volumes as well. There is also an object storage service called Swift, which can be used for generic data storage, and is even run and sold as a separate service by some.

OpenStack is largely backed and used by Rackspace. It recently lost NASA support, but has more than made up for that in good publicity. It has yet to gain a wide install base as of this writing, though there are many contributors and people playing with it, if the mailing lists and blogs are any indication.

So back to the original question: who is going to win?  My prediction is that neither will “win”; this simply isn’t a one-winner game.  Who won between Xen and VMware? Who won between Red Hat and Ubuntu?  For all of the posturing that some might do, and the holy-war attitudes that grow up around technologies, in the end there is usually room for more than one product. CloudStack seems to be maintaining the lead at the moment, tomorrow OpenStack may get ahead, and there may even end up being a 70/30 split, but the companies involved and the scale of the market indicate that there will be active development and support for both platforms. This has been evident to me in just the past few months, where I’ve seen companies scrambling to provide support for their products on both platforms. Companies like Brocade, Nicira/VMware, and NetApp are actively supporting and developing code for both platforms in hopes of driving cloud customers to use their products.

Even if it were a one-winner contest, depending on your size you may be able to influence the race. If you see a tug-of-war in progress, with ten men on either side, and you’ve got a team of five behind you, do you need to stop and consider which team you should support in order to back the winner?  Some individuals I’ve heard asking this question should really be asking “which technology do I want to help win?” instead of thinking of themselves as inactive bystanders in the equation.

In all of the back and forth, and daily news about who is building what with the backing of which companies, one thing is certain. Cloud infrastructure as a service is going to be a hit.

Tuesday, May 15th, 2012 | Author:

I recently purchased two new 27″ displays from a Korean eBayer. You can read all about them here and here.  Basically they’re the same panels that are in the nice Apple displays, but they didn’t quite make the cut for one reason or another, so they got sent to obscure Korean manufacturers who sell them in monitors for a quarter of the price.

I purchased them Sunday afternoon for $319 each, shipping included, and 48 hours later they arrived. They both look remarkably good; I really had to hunt to find any discernible defect. One has two tiny stuck pixels, and the other I *think* is just a hair dimmer, though that seemed to go away with a brightness adjustment.

Really, though, this post is about how I got them to play nicely with the Nvidia TwinView on my Ubuntu desktop. You see, with the nouveau driver that is loaded by default, one worked fine, but to get the full acceleration and TwinView, I had to install the nvidia module, and for some reason it didn’t want to properly retrieve the monitor’s EDID. The result was a flickering 640 x 480 display; not pretty.

In troubleshooting, I noticed that ‘xrandr --prop’ would retrieve the EDID nicely, but tools like get-edid from the read-edid package would return 128 bytes’ worth of 1s and complain of a corrupt EDID. X seemed to pick up the proper one when running the nouveau driver, and not when running the nvidia driver.

So I fired up a hex editor, pasted in the EDID as reported by xrandr (all 128 bytes) to create a binary file, and added a custom EDID entry to my xorg.conf so the nvidia driver would work with the dual displays.
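
If you’d rather not hand-type hex, something along these lines can pull the EDID out of xrandr and write the binary file directly (the output path is just where I chose to keep mine):

  # Grab the indented hex rows that xrandr prints under "EDID:" and
  # convert them into a 128-byte binary blob.
  xrandr --prop \
    | awk '/EDID:/ {grab=1; next}
           grab && /^[ \t]+[0-9a-f]+$/ {printf "%s", $1; next}
           grab {exit}' \
    | xxd -r -p > /etc/X11/shimian-edid.bin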

You can add the following to the screen section of /etc/X11/xorg.conf, just under metamodes or wherever you prefer.

Option "CustomEDID" "DFP:/etc/X11/shimian-edid.bin"

Note you can also do semicolon delimited for multiple displays (or so I’ve read):

Option "CustomEDID" "DFP-0:/etc/X11/shimian-edid.bin; DFP-1:/etc/X11/catleap.bin"
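
For context, here’s roughly how that sits in a TwinView screen section; the identifiers and metamodes are just an example layout, not my exact config:

  Section "Screen"
      Identifier "Screen0"
      Device     "Device0"
      Monitor    "Monitor0"
      Option     "TwinView" "1"
      Option     "metamodes" "DFP-0: 2560x1440 +0+0, DFP-1: 2560x1440 +2560+0"
      Option     "CustomEDID" "DFP-0:/etc/X11/shimian-edid.bin; DFP-1:/etc/X11/catleap.bin"
  EndSection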

I’m including the QH270 EDID .bin file here, in case anyone is desperately looking for it or having a hard time creating one. It should be similar to, or may even work as a drop-in replacement for, the Catleap Q270, as well as the other Achieva models.

Friday, January 20th, 2012 | Author:

Here’s a quick weigh-in on the new experimental device-mapper thin provisioning (and improved snapshots) that landed in Linux kernel 3.2. I recently compiled a kernel and tested it, and it looks rather promising. I would expect this to become stable more quickly than, say, btrfs (which obviously has different design goals but could also be used as a means of snapshotting files/VM disks). With any luck, LVM will be fitted with support relatively soon.
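
Until the LVM tooling catches up, the thin target can be driven directly with dmsetup. A minimal sketch follows; the device names and sizes are made up, sizes are in 512-byte sectors (so 41943040 is 20G), and 128 sectors is a 64K block size:

  # /dev/sdb1 holds the pool metadata, /dev/sdb2 the pool data.
  dmsetup create pool --table "0 41943040 thin-pool /dev/sdb1 /dev/sdb2 128 0"
  # Ask the pool to allocate thin device id 0, then activate a 20G thin volume on it.
  dmsetup message /dev/mapper/pool 0 "create_thin 0"
  dmsetup create thinvol --table "0 41943040 thin /dev/mapper/pool 0"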

These are quick-and-dirty numbers; the sequential write test was done with ‘dd if=/dev/zero of=testfile bs=1M count=16000 conv=fdatasync’, and results are in MB/s.  The random I/O test was done with fio:

[global]
ioengine=libaio
iodepth=4
invalidate=1
direct=1
thread
ramp_time=20
time_based
runtime=300

[8RandomReadWriters]
rw=randrw
numjobs=8
blocksize=4k
size=512M
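
To reproduce the random I/O numbers, save that job file and run fio from the filesystem under test; the mount point and job file name here are arbitrary:

  cd /mnt/thintest && fio ~/randrw-8writers.fio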


At any rate, it looks like we’re well on our way to high-performance LVM snapshots that actually work (finally!).


Thursday, December 01st, 2011 | Author:

Version 0.9 gains some functionality that allows SeekMark to be used as a quick and dirty random I/O generator. SeekMark does a good job of pounding your disk as hard as possible in all-or-nothing fashion, but now you can specify a delay to insert between seeks to reduce the load and simulate a particular scenario. For example, you may want to test the performance of some other application while the system is semi-busy doing random I/O on a database file, or you may want to test shared storage between multiple hosts, where one host has 4 processes doing 64k random reads every 20ms and another host has 2 processes, one doing busy 4k random writes as fast as possible and the other doing 128k reads every 50ms.

Along with this is a new -e option that runs SeekMark in endless mode; that is, it will simply run until it’s killed.
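
For instance, the first host in the scenario above might run something like the following, with the new delay option (see the SeekMark page for its exact flag) added to pace the reads at 20ms; the device path is just an example:

  ./seekmark -e -t4 -i65536 -f/dev/sdb    # 4 threads of 64k random reads, endless mode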

As usual, see the seekmark page, linked at the top of the blog.

Wednesday, August 24th, 2011 | Author:

Ceph, an up-and-coming distributed filesystem, has a lot of great design goals. In short, it aims to distribute both data and metadata among multiple servers, providing fault-tolerant and scalable network storage. Needless to say, this has me excited, and while it’s still under heavy development, I’ve been experimenting with it and thought I’d share a few simple benchmarks.

I’ve tested two different ‘flavors’ of Ceph. The first, I believe, is referred to as the “Ceph filesystem”, which is similar in function to NFS in that the file metadata (in addition to the file data) is handled by remote network services and the filesystem is mountable by multiple clients. The second is a “RADOS block device”, or RBD. This refers to a virtual block device that is created from Ceph storage, and it is similar in function to iSCSI, where remote storage is mapped into looking like a local SCSI device. This means that it’s formatted and mounted locally, and other clients can’t use it without corruption (unless you format it with a cluster filesystem like GFS or OCFS2).
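
To make the distinction concrete, the two flavors are consumed quite differently. A rough sketch, with placeholder monitor host, image name, and size:

  # Ceph filesystem: mounted like NFS, shareable by many clients at once.
  mount -t ceph mon1.example.com:6789:/ /mnt/ceph -o name=admin,secretfile=/etc/ceph/admin.secret
  # RBD: carve a block device out of RADOS, map it, and use it like a local disk.
  rbd create testimg --size 20480    # size in MB
  rbd map testimg                    # older releases echo into /sys/bus/rbd/add instead
  mkfs.xfs /dev/rbd0 && mount /dev/rbd0 /mnt/rbd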

If you’re wondering what RADOS is, it’s Ceph’s acronym answer to RAID; it stands for “Reliable Autonomic Distributed Object Store”. Technically, the Ceph filesystem is implemented on top of RADOS, and other things are capable of using it directly as well, such as the RADOS gateway, a proxy server that provides object store services like those of Amazon’s S3. A librados library is also available that provides an API for building your own solutions.

I’ve taken the approach of comparing Ceph fs to NFS, and RBD to both a single iSCSI device and multiple iSCSI devices striped across different servers. Mind you, Ceph provides many more features, such as snapshots and thin provisioning, not to mention the fault tolerance, but if we were to replace the function of NFS we’d put Ceph fs in its place; likewise, if we replaced iSCSI, we’d use RBD. It’s good to keep this in mind because of the penalties involved with having metadata handled at the server; we don’t expect Ceph fs or NFS to have the metadata performance of a local filesystem.

  • Ceph (version 0.32) systems were 3 servers running mds+mon services. These were quad-core servers with 16G RAM. The storage was provided by 3 osd servers (24-core AMD box, 32GB RAM, 28 available 2T disks, LSI 9285-8e); each server used 10 disks, one osd daemon for each 2T disk, and an enterprise SSD partitioned up with 10 x 1GB journal devices. I tried both btrfs and xfs on the osd devices; for these tests there was no difference. CRUSH placement defined that no replica should be on the same host, with 2 copies of data and 3 copies of metadata (see the sketch after this list). All servers had gigabit NICs.
  • Second Ceph system has monitors, mds, and osd all on one box. This was intended to be a more direct comparison to the NFS server below, and used the same storage device served up by a single osd daemon.
  • NFS server was one of the above osd servers with a group of 12 2T drives in RAID50 formatted xfs and exported.
  • RADOS benchmarks ran on the same two Ceph systems above, from which a 20T RBD device was created.
  • ISCSI server was tested with one of the above osd servers exporting a 12 disk RAID50 as a target.
  • ISCSI-md was achieved by having all three osd servers export a 12 disk RAID50 and the client striping across them.
  • All filesystems were mounted with noatime,nodiratime, whether the options were applicable or not. All servers were running kernel 3.1.0-rc1 on CentOS 6. Benchmarks were performed using bonnie++, as well as a few simple real-world tests such as copying data back and forth.
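
As a rough sketch of the placement settings mentioned above (not the exact configuration used), the replica counts come from the pool sizes and the host-level separation comes from a CRUSH rule:

  ceph osd pool set data size 2         # 2 copies of file data
  ceph osd pool set metadata size 3     # 3 copies of metadata
  # and, in the decompiled crushmap, a rule step along the lines of:
  #   step chooseleaf firstn 0 type host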

ceph-nfs-iscsi-benchmarks.ods

The sequential character writes were CPU-bound on the client in all instances; the sequential block writes (and most sequential reads) were limited by the gigabit network. The Ceph fs systems seem to do well on seeks, but this did not translate directly into better performance in the create/read/delete tests. It seems that RBD is roughly in a position where it can replace iSCSI, but the Ceph fs performance needs some work (or at least some heavy tuning on my part) in order to get it up to speed.

It will take some digging to determine where the bottlenecks lie, but in my quick assessment most of the server resources were only moderately used, whether on the monitors, mds, or osd servers. Even the fast SSD journal disk only ever hit 30% utilization, and didn’t boost performance significantly over the competitors that don’t rely on one.

Still, there’s something to be said for this, as Ceph allows storage to fail, to be dynamically added, thin provisioned, rebalanced, snapshotted, and much more, with passable performance, all in pre-1.0 code.  I think Ceph has a big future in open source storage deployments, and I look forward to it being a mature product that we can leverage to provide dynamic, fault-tolerant network storage.


Friday, June 03rd, 2011 | Author:

I just updated SeekMark to include a write seek test. I was initially reluctant to do this, because nobody would ever want to screw up their filesystem by performing a random write test to the disk it resides on, right?? Of course not, but occasionally you need to benchmark a disk, for the sake of benchmarking, and aren’t worried about the data. And of course, I didn’t care about that functionality until I needed it myself!

So here we have version 0.8 of SeekMark, which adds the following features:

  • write test via “-w” flag, with a required argument of “destroy-data”
  • allows for specification of io size via the “-i” flag, from 1 byte to 1048576 bytes (1 megabyte). The intended purpose of the benchmark (which is to test max iops and latency) is still best fulfilled by the default io size of 512, but changing the io size can be useful in certain situations.
  • added a “-q” flag per suggestions, which skips per-thread reporting and limits output to the result totals and any errors that arise (see the example below)
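
Putting the new flags together, a destructive 4k random-write run with quiet output might look like this (the device path is only an example):

  ./seekmark -w destroy-data -i4096 -q -t4 -f/dev/sdb -s500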

Now head on over to the SeekMark page and get it!

Monday, February 14th, 2011 | Author:

High quality, dual screen 3840×1080 desktop backgrounds, fresh from my fractmark utility. These are large PNG files, because I’m a sucker for detail.

-y -.6003 -x-.367 -X -.3658 -l 60000

Wednesday, February 09th, 2011 | Author:

I just added a new page for SeekMark, a little program that I put together recently to test the number of random accesses/second to disk. It’s threaded and will handle RAID arrays well, depending on the number of threads you select. I’m fairly excited about how this turned out, it helped me prove someone wrong about whether or not a particular RAID card did split seeks on RAID1 arrays. The page is here, or linked to at the top of my blog, for future reference.  I’d appreciate hearing results/feedback if anyone out there gives it a try.

Here are some of my own results, comparing a 5-disk Linux md RAID10 array against one of the underlying disks. I’ll also show the difference that threading the app makes in the results:

single disk, one thread:

  [root@server mlsorensen]# ./seekmark -t1 -f/dev/sda4 -s1000
  Spawning worker 0
  thread 0 completed, time: 13.46, 74.27 seeks/sec

  total time: 13.46, time per request(ms): 13.465
  74.27 total seeks per sec, 74.27 seeks per sec per thread

single disk, two threads:

  [root@server mlsorensen]# ./seekmark -t2 -f/dev/sda4 -s1000
  Spawning worker 0
  Spawning worker 1
  thread 0 completed, time: 27.29, 36.64 seeks/sec
  thread 1 completed, time: 27.30, 36.63 seeks/sec

  total time: 27.30, time per request(ms): 13.650
  73.26 total seeks per sec, 36.63 seeks per sec per thread

Notice we get pretty much the same result, about 74 seeks/sec total.

5-disk md-raid 10 on top of the above disk, one thread:

  [root@server mlsorensen]# ./seekmark -t1 -f/dev/md3 -s1000
  Spawning worker 0
  thread 0 completed, time: 13.09, 76.41 seeks/sec

  total time: 13.09, time per request(ms): 13.087
  76.41 total seeks per sec, 76.41 seeks per sec per thread

Still pretty much the same thing. That’s because we’re reading one small thing and waiting for the data before continuing. Our test is blocked on a single spindle!

four threads:

  [root@server mlsorensen]# ./seekmark -t4 -f/dev/md3 -s1000
  Spawning worker 0
  Spawning worker 1
  Spawning worker 2
  Spawning worker 3
  thread 1 completed, time: 15.02, 66.57 seeks/sec
  thread 2 completed, time: 15.46, 64.69 seeks/sec
  thread 3 completed, time: 15.57, 64.24 seeks/sec
  thread 0 completed, time: 15.69, 63.74 seeks/sec

  total time: 15.69, time per request(ms): 3.922
  254.96 total seeks per sec, 63.74 seeks per sec per thread

Ah, there we go. 254 seeks per second. Now we’re putting our spindles to work!

Tuesday, February 08th, 2011 | Author:

I’ve just added a page for FractMark, a simple multi-threaded fractal-based benchmark. Read more about it (and download it) here.

On a side note, some of you may be familiar with a similarly simple I/O benchmark called PostMark. It was written under contract for Network Appliance and is known as an easy, portable random I/O generator.  At this point the source code has been pretty much abandoned as far as I can tell, so I’ve picked it up and have begun adding some bugfixes as well as some enhancements. The primary things I’ve done so far are to add an option for synchronous writes on Linux, as well as threaded transactions, which should give people the flexibility to test scenarios where they might have many processes creating random I/O.
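
For those unfamiliar with it, PostMark workloads are described with a tiny command script; a minimal example looks like the following (the path and sizes are arbitrary, and my new options aren’t shown since they’re still settling):

  set location /mnt/testfs
  set number 500
  set size 500 10000
  set transactions 2000
  run
  quit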

If this interests you, I’ll be posting the source code and patches soon!
