AWS Glue is a serverless extract, transform, and load (ETL) service provided by Amazon as a part of Amazon Web Services. It was introduced in August 2017.[2]
Developer(s) | Amazon.com |
---|---|
Initial release | August 2017[1] |
Operating system | Cross-platform |
Available in | English |
Website | aws |
Overview
The primary purpose of Glue is to scan other services[3] in the same Virtual Private Cloud (or an equivalent accessible network element, even if not provided by AWS), particularly S3.[citation needed] Jobs are billed according to compute time, with a minimum billing duration of 1 minute.[4] Glue discovers the source data and stores the associated metadata (e.g. a table's schema of field names, types, and lengths) in the AWS Glue Data Catalog, which is then accessible via the AWS console or APIs.[5]
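The scan-and-catalog flow described above can be sketched with the AWS SDK for Python (boto3), which is an assumption here and is not named in the sources. The helper names, the crawler name, the IAM role ARN, the database name, and the S3 path are all hypothetical placeholders; `create_crawler` and `start_crawler` are calls on the boto3 Glue client. The client is passed in as a parameter so the sketch does not require AWS credentials to import.

```python
def build_crawler_request(crawler_name, role_arn, database_name, s3_path):
    """Build the keyword arguments for the Glue client's create_crawler call.

    All four values are caller-supplied placeholders (hypothetical names).
    """
    return {
        "Name": crawler_name,
        "Role": role_arn,                # IAM role the crawler runs as
        "DatabaseName": database_name,   # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(glue_client, request):
    """Register a crawler and start one run of it.

    glue_client is expected to behave like boto3.client("glue").
    """
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])
    return request["Name"]
```

With boto3 installed and credentials configured, `glue_client` would typically be created with `boto3.client("glue")`. Once the crawler run finishes, the discovered table schemas appear in the named Data Catalog database.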
Languages supported
Scala and Python are officially supported as of 2020.[6]
Catalog interrogation via API
The catalog can be read in the AWS console (via a browser) and through the API, which is divided into topics including:[7]
- Database API
- Table API
- Partition API
- Connection API
- User-Defined Function API
- Importing an Athena Catalog to AWS Glue
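As an illustration of the Table API, the following minimal sketch reads column names and types for every table in one catalog database. It assumes boto3 as the SDK; the function name and the database name in the usage note are hypothetical. The `get_tables` paginator and the `TableList`, `StorageDescriptor`, and `Columns` response keys are those of the boto3 Glue client.

```python
def list_table_schemas(glue_client, database_name):
    """Return {table_name: [(column_name, column_type), ...]} for one database.

    glue_client is expected to behave like boto3.client("glue").
    """
    schemas = {}
    paginator = glue_client.get_paginator("get_tables")
    # Results are paginated, so iterate over every page of tables.
    for page in paginator.paginate(DatabaseName=database_name):
        for table in page["TableList"]:
            # Some tables carry no storage descriptor; treat them as having no columns.
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            schemas[table["Name"]] = [
                (column["Name"], column.get("Type", "")) for column in columns
            ]
    return schemas
```

With boto3 installed and credentials configured, a call such as `list_table_schemas(boto3.client("glue"), "demo_db")` would return the schemas recorded for that database.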
References
- "Introducing AWS Glue: A Simple, Flexible, and Cost-Effective Extract, Transform, and Load (ETL) Service".
- "AWS Services List". ParkMyCloud. Retrieved October 6, 2020.
- "AWS Glue: crawlers and use cases". January 5, 2022. Retrieved July 13, 2022.
- "AWS Glue version 2.0 featuring 10x faster job start times and 1-minute minimum billing duration". AWS. August 10, 2020. Retrieved October 6, 2020.
- "AWS Glue API Documentation". AWS. Retrieved October 6, 2020.
- "AWS Glue Now Supports Scala in Addition to Python". AWS. January 12, 2018. Retrieved October 6, 2020.
- "Catalog API". AWS. Retrieved October 8, 2020.