Data Tools - WP Bolt

What is Hadoop good for?

It must be confusing if you are a Hadoop newbie at the moment. There are so many conflicting opinions about what it should be used for, as anecdotally demonstrated at the end of this beautifully written piece about the Big Data London event. At one end of the spectrum you have people saying, “it’s only […]

Glue Connector

The Glue connector is a metadata connector used for querying and creating tables in AWS Glue. When you create an external table with this connector and give it the name of a table already in Glue, the connector finds out the table’s column types, data location and storage format. The Kognitio […]

SYN cookies ate my dog – breaking TCP on Linux

TCP is supposed to guarantee that all bytes sent by one endpoint of a connection will be received, in the same order, by the other endpoint. In this article we’ll identify and demonstrate a wrinkle in the Linux implementation of TCP SYN cookies. The client can connect and send two packets, but the server’s TCP […]

SQL Commands: TO_TIME

The TO_TIME function converts a string in a given format to a Kognitio TIME data type. It will also accept a number instead of a string, within certain limits. It is possible to specify a literal string, a literal number, or a database column containing a string or number. In every case but one, their […]

Which companies are using Hadoop for big data analytics?

Forrester once predicted that enterprise adoption of Hadoop would become mandatory. While some companies are still struggling with their Hadoop projects, others are using the big data framework to revolutionize their data storage and analytics. The advantages of Hadoop, namely flexibility and lower costs, appeal to enterprises, so Hadoop has fundamentally changed how businesses […]

Hadoop compared with other technologies

What is the difference between Apache Spark™ and Apache™ Hadoop®? Spark is an open source, in-memory, parallel data processing engine. It is an alternative to MapReduce, which is less widely used these days. Spark can run on Hadoop or on its own cluster. What is the difference between big data and Hadoop? Big data is […]

Set analysis with variables – dynamic secret sauces and meet your new best friend in Qlik Sense

I discovered set analysis with variables recently and thought it would be useful to share how I used them. Even if you know how to use them, check out the end of this blog where I use an extension to make variable changes more usable – your new best friend! This guide is applicable to […]

Hadoop Commands Cheat Sheet

This cheat sheet outlines some of the main Hadoop commands that we’ve found useful while building our Cloudways alternative hosting service. Generic: hadoop fs -ls <path> lists files in the path of the file system; hadoop fs -chmod <arg> <file-or-dir> alters the permissions of a file, where <arg> is the octal mode, e.g. 777; hadoop […]

SQL Commands: LIKE and ILIKE

The predicates LIKE and ILIKE are used to search for strings that match a given pattern, so you can search for a single word (or string) in a long text field. LIKE is case sensitive; ILIKE is case insensitive. You can use these SQL commands via phpMyAdmin on our Cloudways Alternative hosting plans. Usage […]