Tag Archives: Spark
Google Cloud Dataproc and the 17 minute train challenge
My work commute
My train commute to and from work takes 17 minutes on average. It's the usual uneventful affair, where most people pass the time by browsing on their mobile devices, catching a few Zs, or reading a …
Posted in DevOps, Linux, Opinion, Tools
Tagged Big Data, Cloud, Cluster Computing, Clustering, Google BigQuery, Google Cloud, google cloud dataflow, Google Cloud Platform, Google Cloud Storage, Hadoop, IaaS, Java, MapReduce, Python, R, Scala, Software Engineering, Spark, YARN