November 24, 2015
I recently came across an article from Consumer Reports containing five predictions about television sales for Black Friday (http://www.consumerreports.org/cro/lcdtvs/5-predictions-for-black-friday-2015). A few of these predictions caught my interest (especially price matching and pricing transparency). They underline one of the driving forces in retail: how do retailers improve their margins while remaining competitive? We commonly talk about cloud as the platform for bringing innovative capabilities that attract new clients and keep existing ones in retail, but I’d like to focus on another benefit: the cost savings that cloud can bring to retailers.
Let’s look at two dimensions related to cost: the initial cost (to set up) and the ongoing costs of workloads. Most people understand that the initial costs should be lower when leveraging a cloud solution (e.g. a public cloud from IBM, Amazon, or Microsoft) versus investing in infrastructure in a traditional data center. However, many consider the ongoing costs for cloud to be more expensive. I’d like to explore this angle for a moment.
Looking at the traditional data center, there are four areas to explore related to ongoing costs and how they might change with cloud:
- Peak traffic and variability
- Cost of power/electricity usage
- Infrastructure labor costs
- Homogeneity vs. heterogeneity
Almost all retailers face the challenge of how best to handle peak traffic or huge variations in traffic. Whether it is Black Friday, Valentine’s Day, Mother’s Day, or the rush when a movie lets out, many retailers have to plan for these variations. Many organizations address the challenge with the “high-water mark” principle: you allocate enough computing capacity to handle the maximum traffic and keep it available all year long. With this approach, there are significant costs associated with keeping the infrastructure available whether it is being used or not. Being able to scale up capacity during these peak times and scale down afterwards is a classic cloud usage scenario that does result in reduced costs.
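To see why the high-water-mark approach is so expensive, here is a back-of-the-envelope comparison in Python. All of the numbers (fleet sizes, the hourly rate, six peak weeks) are hypothetical, chosen only to illustrate the arithmetic:

```python
# Hypothetical scenario: a retailer needs 100 servers during six peak
# weeks per year, but only 20 the rest of the time.
HOURS_PER_YEAR = 8760
PEAK_HOURS = 6 * 7 * 24                  # six peak weeks
BASE_HOURS = HOURS_PER_YEAR - PEAK_HOURS
COST_PER_SERVER_HOUR = 0.50              # assumed blended rate, in dollars

# "High-water mark": provision for the peak and keep it all year long.
high_water_cost = 100 * HOURS_PER_YEAR * COST_PER_SERVER_HOUR

# Elastic: pay for 20 servers normally, 100 only during the peaks.
elastic_cost = (20 * BASE_HOURS + 100 * PEAK_HOURS) * COST_PER_SERVER_HOUR

print(f"high-water mark: ${high_water_cost:,.0f}")
print(f"elastic scaling: ${elastic_cost:,.0f}")
```

Even with these made-up figures, the always-provisioned fleet costs several times more than the elastic one; the gap grows as the peaks get spikier.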
The cost of power is a metric we sometimes forget. Electricity usage is rising to the point where it is becoming the largest element of TCO for most data centers. It was estimated in 2013 that over 10% of the world’s electricity is consumed by IT (http://www.theregister.co.uk/2013/08/16/it_electricity_use_worse_than_you_thought/). Many organizations use the PUE (power usage effectiveness) metric to measure power efficiency in data centers. Unfortunately, much of the infrastructure in today’s data centers is obsolete, outdated, or unused, so power efficiency tends to be much worse in traditionally owned data centers. Moving to a cloud-based environment removes the burden of measuring PUE yourself, reduces your electricity consumption, and has a positive impact on the environment.
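For reference, PUE is just a ratio: total facility energy divided by the energy that actually reaches the IT equipment. A value of 1.0 is the theoretical ideal, and the higher the number, the more power is lost to cooling, power distribution, and idle gear. A minimal sketch (the sample figures are illustrative, not measurements):

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: 1.0 is ideal; higher is worse."""
    return total_facility_kwh / it_equipment_kwh

# Illustrative figures only:
traditional = pue(2000, 1000)  # 2.0 - half the power never reaches IT gear
cloud_scale = pue(1150, 1000)  # 1.15 - roughly the range large cloud
                               # operators have publicly reported
print(traditional, cloud_scale)
```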
Have you looked at the infrastructure labor costs in your organization? I recently had the opportunity to look at a few cloud-based data centers. One thing that impressed me was how few people you see. In a traditional data center, a single administrator typically manages around 150 servers. In a cloud-based data center, the ratio changes to around 1,000 servers per administrator. Automation is clearly the critical factor in optimizing labor and freeing your people for other activities.
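Those two ratios translate directly into labor cost. A quick sketch with hypothetical numbers (the fleet size and fully loaded salary are assumptions, not data from any real organization):

```python
import math

servers = 3000           # hypothetical fleet size
salary = 90_000          # assumed fully loaded annual cost per admin, dollars

traditional_admins = math.ceil(servers / 150)    # ~150 servers per admin
cloud_admins = math.ceil(servers / 1000)         # ~1000 servers per admin

annual_savings = (traditional_admins - cloud_admins) * salary
print(traditional_admins, cloud_admins, annual_savings)
```

For this made-up 3,000-server fleet, the headcount drops from 20 administrators to 3.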
A famous quote about the Model T from Henry Ford was that “any customer can have a car painted any color that he wants, so long as it is black.” Why did he say this? The Model T only came in black because other colors would slow down the production line and hurt efficiency. There are similarities in cloud: homogeneity plays a major role in reducing cost. Think about the costs incurred because of differences in the environments, platforms, and applications that run in your data center. Cloud offers great efficiency through standardization, which translates into cost savings. You will need to do the work to move these workloads onto common platforms, and detailed workload assessments can help.
My final thought struck me when I was in a shopping mall last week. I was making a purchase at a store that was using old point-of-sale devices. For retailers with physical stores scattered across many locations, the costs of managing the IT infrastructure in each store represent an opportunity to leverage cloud to reduce costs as well as provide buyers a more delightful experience.
November 20, 2015
The agile edge capability within a Hybrid IT model is what we are calling the environment that allows for both rapid experimentation and elastic scaling to cope with exponential adoption. It enables faster IT delivery and increasing innovation from both internal employees and external partners or ecosystems. This requires a major change in culture, procedures, and technology. Enterprises will need to shift away from rigid methodologies and processes toward those that enable agile and collaborative development (e.g. hackathons). Besides Platform as a Service (PaaS), this could include the use of IBM Design Thinking (learn more here), which is based on user-focused development and highlights the user experience rather than the product itself through frequent updates and feature releases. Additionally, an enterprise with mature agile edge capabilities may create venture funding partnerships that allow new growth models.
November 17, 2015
I have continued to pursue my interest in Big Data and Spark. As I mentioned before, Big Data University is where I ended up. I took the Big Data Spark Fundamentals course. It was a great overview of the capabilities of Spark. Here are some of my impressions.
- Data Source simplicity – The simplicity of the Spark programming model is amazing. Many of the elements needed to perform tasks are already set up and created for you. Numerous existing Spark connectors have been built to allow you to work on data from many sources. Remember, one of the benefits of Spark is that the programming model is the same regardless of the data source. Obviously, Hadoop provides the infrastructure to store massive amounts of data, but there are times when data exists in files or existing SQL databases. The connectors do all the hard work.
- Programming simplicity – All of the examples that the course took me through showed the tremendous power of Spark and how it can be achieved with an incredibly small amount of code. Being able to trudge through millions of records in a very short amount of time in a few lines of code is really amazing.
- It’s all about the API – The programming model is very simple and the concepts are fairly simple, but all the power of Spark is in understanding the API. The basics of transformations and actions are easy to understand, but knowing how to construct the transformations is the key. I suspect most people rely on really good examples. The open source community is much better at this today than in the past. I write very little original code these days and often borrow from good examples that are out there in the community.
- Data Scientist and Spark Developer – The magic here is the collaboration between the data scientist (the one who knows the data and the questions to ask) and the Spark developer (the one who translates that into Spark code). There is a lot of business value in this small collaborative team. You can envision a very large enterprise with large data and a relatively small team being able to garner immense business value using Spark.
Now that I am armed with my Big Data Spark Fundamentals badge from Big Data University, I am going to check out what else I can learn from this portal. You should too.
November 10, 2015
by Russell Hargraves and Sumit Patel
Nothing is more difficult to undertake, more perilous to conduct or more uncertain in its outcome, than to take the lead in introducing a new order of things. For the innovator has for enemies all those who have done well under the old and lukewarm defenders amongst those who may do well under the new. Niccolo Machiavelli, The Prince (1513)
The world is entering into a new era of computing that will enable the digital transformation of society and business based on the advancement and personalization of cognitive computing. Cognitive computing systems learn and interact naturally with people to extend what either a human or machine could do on their own. Cognitive systems like IBM Watson are redefining society, business and human interaction in the increasingly pervasive digital economy by helping everyone and everything make better decisions.
Today, the world is being rewritten in software code, igniting the explosion of big data enabled by apps, mobile devices, social networks, and the Internet of Things (IoT) and ushering in the new cognitive era. The cloud and the emergence of the industrial hybrid cloud are the platforms on which the new digital builders (developers, business professionals, governments, and individuals) are reimagining everything from education, banking, retail, and healthcare to transportation and beyond, as seen in the figure below.
November 4, 2015
I have been a developer (at least at heart) all of my professional career. I have always found ways to keep my hands dirty in some type of coding effort. However, the older you get the more removed you become. Development is a young man’s game. However, I think I have found my next programming model playground.
If you haven’t noticed by now, Big Data is kind of a big deal. When you hear that things like “each day we produce 2.5 quintillion bytes of data” and “90% of the world’s data has been created in the last 2 years” it doesn’t take a genius to hypothesize that there might be some hidden value in all that data. It appears that the storage industry has no problem keeping up with this demand and networking is also getting better and better (I thought physics was involved but apparently we keep finding ways to get more through the same pipe).
So the problem to solve falls to the foot soldiers of every technical problem: the developers. And the default landscape that all developers maneuver and work in is open source. The Apache Spark project is another great example of the open source ecosystem gaining unfathomably quick traction in solving a problem.
I got to go to the IBM Insight conference last week and took that opportunity to learn some things (between my booth duty stints). There were many Spark sessions to attend, but most were full and many people were turned away. And being an IBMer, I got sent to the end of the line for any walk-up lab spots. However, I was able to attend a few sessions and learn some good things. I am beginning my self-education by taking some Big Data University courses online (check them out here).
I have spent a lot of my time in the developer community not only pushing out code, but also constantly tweaking and consulting on the interactions between developers and the rest of the larger team. Spark brings a new interesting dimension to this dynamic. As we talked about before, the reason we are here is that there is lots of value in all that data. So Spark was created as a programming model to make it simple to carry out highly compute-intensive data manipulation. The difference here is that someone typically asks a question that potentially has a simple answer, but getting it goes beyond the typical programming models/platforms that exist today.
Let’s examine some of the differences that this problem set brings to the party.
- The user interface is not important. We have spent so much effort on UI design, frameworks, etc. due to the boom of the mobile device. Application look-and-feel and user experience are so important in the mobile space due to the intense competition between vendors. In the big data space, the answer is really the only thing we care about.
- The requirements are simple and the results are typically simple; getting there is the hard part. The vast majority of the work done by Spark applications is the chunking of data. A very simple application (very few lines) can perform massive amounts of processing. The Spark platform is the ultimate effort in pushing all of the complexity below the development experience.
- Data scientists are the big data analysts. The role of the data scientist is the role that sits between the line-of-business and the developer. Data scientists know the data that is being captured and help translate the question being asked to the Spark developer. As a matter of fact, with tools like the Data Scientist Workbench, we are providing a platform for Data Scientists to learn enough about Spark to do the work themselves.
- The art is in understanding how Spark works and programming the effort accordingly. Understanding how Spark divides up the work, when and where to store intermediate data (if at all), and tuning the program accordingly is where a Spark developer brings the value. Programming a web application can be done with a single-user mentality in mind. Scaling the application can be tackled at other levels of the architecture. As I said before, the requirements for Big Data apps are typically simple and the answer is typically also simple, but the time it takes for the application to get to that answer is all that counts.
As I explore Spark more, I will keep you posted along the way. Let me know your thoughts.