Big Data

Today’s business organizations are collecting vast amounts of data that vary in structure, complexity and size. What all of them are discovering, however, is that the wealth of strategic value in this data is difficult to extract using traditional relational database management tools. Beyond strong technical competency, using Big Data tools requires a fundamental shift in how organizations view their data, its structure, its usage scenarios and the roles and responsibilities of the IT and user organizations.

Our Big Data practice assists our clients with everything from strategic upstream activities, such as evaluating and developing a Big Data road-map, to the implementation and support of large environments.

Our Advisory Services include:

  • Identifying/defining Big Data business/project initiatives
  • Developing a Big Data implementation road-map
  • Creating proofs of concept, white papers and technology / tool evaluations
  • Providing a road-map to help clients choose appropriate technologies / frameworks / tools
  • Implementing best practices and industry standards
  • Implementing new tools and technologies to provide innovative solutions

Our Execution Services include:

  • Planning, design and implementation of Hadoop and other Big Data environments
  • Developing/enhancing Java, C++ or LAMP-based applications on existing or new Hadoop implementations
  • Troubleshooting/performance optimization of existing Hadoop implementations
  • Data quality management and data harmonization projects
  • Testing/QA of big data applications, automation of data validations and regression test scenarios
  • Documentation, programmer training, reverse-engineering, upgrade, maintenance, migration and other steady-state services

Below are some of the technologies our Big Data practice works with:

 
  • Programming Languages: Java, Python, JavaScript (client-side as well as Node.js)
  • Distributed File Systems: Apache Hadoop HDFS, Tachyon
  • Key/Value Data Stores: Apache Accumulo, Berkeley DB, MemcachedDB, Redis, Amazon DynamoDB
  • Column-oriented Data Stores: Apache Hive, Apache HBase, Apache Cassandra, Amazon Redshift
  • Document-oriented Data Stores: MongoDB, CouchDB, Riak, RethinkDB
  • Graph-oriented Data Stores: Apache Giraph, Neo4j, Blueprints, OrientDB, GraphX
  • Relational Data Stores: Oracle, MySQL, PostgreSQL, MariaDB, Greenplum, Teradata, BlinkDB, Shark
  • Search Platforms: Apache Solr, Elasticsearch, GSA
  • Text Processing: Apache Tika, Apache Mahout, Apache Stanbol
  • In-memory/Realtime Processing: Apache Spark, Apache Spark Streaming, Apache Storm
  • Statistics, Visualization: Gnuplot, VizQL (Tableau), D3.js, Leaflet (maps)
  • Cloud Platforms: Amazon AWS, OpenStack