Big Data has been around for a while, is heavily used and is pretty much the norm for most companies.  So why a blog about Big Data? Well, this blog is not entirely about Big Data; it is more about how AWS can offer the agility businesses need today to move fast and stay competitive.

Haven Power, a business electricity supplier, approached KCOM with a challenge: to design, build and deploy a repeatable Big Data/Business Intelligence solution hosted on AWS. Where’s the challenge in that, you ask?

The solution must be deployed within a 2-week period to meet a customer need - challenge accepted!

Haven Power’s developers had already written the various SQL queries needed to produce the desired end report from the solution, but they needed help and advice with formalising the cloud infrastructure elements in a secure, optimised and repeatable manner. KCOM’s approach to all solutions is to ensure AWS best practices are followed, providing customers with a solid foundation to host cloud workloads.

Design patterns were applied to accelerate the design process, making it faster to describe the various AWS components in CloudFormation. These patterns included:

  • Isolating VPC-hosted resources in their prescribed subnets, ensuring private resources stay private and are not exposed to potential internet security threats.
  • Using bastion EC2 instances for private connectivity to the environment, built from a pre-hardened KCOM AMI with certificate-based authentication and O/S and session-based audit logging.
  • Multi-node/Multi-AZ deployment of both Redshift and RDS instances, giving the required performance for the solution with high levels of resiliency, both encrypted at rest with AWS KMS.
  • Provisioning a BI solution called Matillion from the AWS Marketplace, using its out-of-the-box workflows to process jobs. The workflows within Matillion are powerful and interface directly with AWS databases and many other decoupled AWS services.
  • Wrapping all instance-based resources in a tight security group model.
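
To make a couple of these patterns concrete, here is a minimal sketch of how an encrypted, multi-node Redshift cluster and its security group could be described for CloudFormation. It is written as a Python dictionary that is dumped to JSON, and it is not the template KCOM built for Haven Power. The parameter names, the `analytics` database name, the node type and the node count are all assumptions made for illustration.

```python
import json


def build_template():
    """Return an illustrative CloudFormation template fragment as a dict.

    All logical names and parameter names here are hypothetical.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "VpcId": {"Type": "AWS::EC2::VPC::Id"},
            # Security group of whatever is allowed to query Redshift
            # (for example the BI instance); assumed to exist already.
            "ClientSecurityGroupId": {"Type": "AWS::EC2::SecurityGroup::Id"},
            "PrivateSubnetGroupName": {"Type": "String"},
            "KmsKeyId": {"Type": "String"},
            "MasterUsername": {"Type": "String"},
            "MasterUserPassword": {"Type": "String", "NoEcho": True},
        },
        "Resources": {
            "RedshiftSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": "Redshift access from approved clients only",
                    "VpcId": {"Ref": "VpcId"},
                    # Tight model: one port, one source group, no CIDR ranges.
                    "SecurityGroupIngress": [{
                        "IpProtocol": "tcp",
                        "FromPort": 5439,
                        "ToPort": 5439,
                        "SourceSecurityGroupId": {"Ref": "ClientSecurityGroupId"},
                    }],
                },
            },
            "RedshiftCluster": {
                "Type": "AWS::Redshift::Cluster",
                "Properties": {
                    "ClusterType": "multi-node",
                    "NumberOfNodes": 2,        # assumed size
                    "NodeType": "dc2.large",   # assumed node type
                    "DBName": "analytics",     # hypothetical name
                    "MasterUsername": {"Ref": "MasterUsername"},
                    "MasterUserPassword": {"Ref": "MasterUserPassword"},
                    "Encrypted": True,         # encrypted at rest with KMS
                    "KmsKeyId": {"Ref": "KmsKeyId"},
                    "PubliclyAccessible": False,  # stays in private subnets
                    "ClusterSubnetGroupName": {"Ref": "PrivateSubnetGroupName"},
                    "VpcSecurityGroupIds": [
                        {"Fn::GetAtt": ["RedshiftSecurityGroup", "GroupId"]}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```

Keeping environment-specific values such as the VPC, subnet group and KMS key as parameters is what would let the same template be reused across environments.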

So, a little bit on how one of the Matillion workflows works:

  • The S3 data ingress bucket triggers a Lambda function when a new object is uploaded.
  • The Lambda function sends a message to an SQS queue containing metadata about the object.
  • Matillion then consumes the SQS messages, processes the metadata and imports the new S3 data objects into Redshift.
  • When the import is complete, Matillion updates the MDM (Master Data Management) data from RDS SQL into Redshift. Matillion then executes several queries against the updated dataset on Redshift to produce a report, which is stored in an egress S3 bucket.
  • Matillion then publishes a message to an SNS topic, which is relayed to subscribed recipients to let them know a new report has been generated.
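
The first two steps of this workflow could look roughly like the Lambda function below. This is a sketch under assumptions rather than the function deployed for Haven Power: the `QUEUE_URL` environment variable, the shape of the metadata message and the optional `sqs_client` argument (there only so the function can be exercised without AWS) are all hypothetical.

```python
import json
import os
from urllib.parse import unquote_plus


def build_messages(event):
    """Turn an S3 event notification into one metadata dict per object."""
    messages = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        messages.append({
            "bucket": s3["bucket"]["name"],
            # S3 URL-encodes object keys in event notifications.
            "key": unquote_plus(s3["object"]["key"]),
            "size": s3["object"].get("size"),
            "event_time": record.get("eventTime"),
        })
    return messages


def handler(event, context, sqs_client=None):
    """Lambda entry point: forward object metadata to the SQS queue."""
    if sqs_client is None:
        import boto3  # provided by the AWS Lambda Python runtime
        sqs_client = boto3.client("sqs")
    queue_url = os.environ["QUEUE_URL"]  # hypothetical variable name
    messages = build_messages(event)
    for message in messages:
        sqs_client.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps(message),
        )
    return {"forwarded": len(messages)}
```

Sending only metadata keeps the messages small; the consumer (Matillion, in this solution) reads the object itself from S3 when it processes the message.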

Sounds neat, eh?

Well, you may be thinking, “this all sounds very achievable in 10 days”. What I omitted to mention was that the solution needed to be deployed into two different stages of the development pipeline, with various degrees of testing per environment, before deployment to production.

Nathan Stone from Haven Power commented, “KCOM have been instrumental in helping us to launch and then mature our BI/Analytics platform”, adding, “Initial consultations with KCOM Consultants and Architects helped us to refine and validate our approach against best practice. We were heavily supported throughout our early proof-of-concept stages and KCOM helped us produce an architecture utilising many AWS Services such as Multi-zone RDS instance, Redshift cluster, Matillion AWS Marketplace and AWS Lambda. KCOM Consultants continue to act as mentors, while ensuring that the infrastructure we spin up is reliable, secure, maintainable and future-proof. KCOM’s early insistence on using CloudFormation to define our AWS accounts has already paid service delivery dividends; and continues to give us confidence that we have a stable, well-documented platform on which to support the business.”

Because of the automated nature of provisioning AWS infrastructure, each iteration can be quickly tweaked, re-deployed, re-tested and validated before deployment to the next environment. Once deployment to production is complete, the source CloudFormation template becomes a very powerful artefact that Haven Power can re-use, giving them a competitive edge in a very challenging market.
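
As a rough illustration of that repeatability, the sketch below launches one CloudFormation template once per environment, with only the parameters changing. It is an assumption-laden outline, not KCOM's deployment tooling: the `bi-platform-` stack prefix, the environment names and the parameter values are hypothetical, and a real pipeline would wait for each stack to finish and run its tests before moving on.

```python
def stack_request(environment, template_body, parameters):
    """Build the keyword arguments for a CloudFormation create_stack call."""
    return {
        "StackName": f"bi-platform-{environment}",  # hypothetical naming
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted(parameters.items())
        ],
    }


def deploy(environments, template_body, cfn_client=None):
    """Create one stack per environment, in the order given.

    `environments` maps an environment name to its parameter values.
    Returns the stack names that were requested.
    """
    if cfn_client is None:
        import boto3  # only needed when talking to AWS for real
        cfn_client = boto3.client("cloudformation")
    requested = []
    for environment, parameters in environments.items():
        request = stack_request(environment, template_body, parameters)
        cfn_client.create_stack(**request)
        requested.append(request["StackName"])
    return requested
```

The same template body is passed for every environment, so whatever was validated in an earlier stage is what reaches production.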
