The Universal Solvent for REST APIs – O’Reilly

Data scientists working in Python or R typically acquire data by way of REST APIs. Both environments provide libraries that help you make HTTP calls to REST endpoints, then transform JSON responses into dataframes. But that’s never as simple as we’d like. When you’re reading a lot of data from a REST API, you need to do it a page at a time, but pagination works differently from one API to the next. So does unpacking the resulting JSON structures. HTTP and JSON are low-level standards, and REST is a loosely-defined framework, but nothing guarantees absolute simplicity, never mind consistency across APIs.
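
To make the pagination problem concrete, here is a minimal Python sketch of two styles of paging. Everything in it is hypothetical: `fetch_numbered` and `fetch_cursor` are in-memory stand-ins that make no HTTP calls, and the response shapes (`results`, `data`, `meta.next_token`) are invented for illustration rather than taken from any real API. The point is only that each style needs its own read loop and its own unpacking.

```python
from typing import Iterator, Optional

# Hypothetical data served by both stand-in "endpoints" below.
ITEMS = [{"id": i} for i in range(1, 8)]


def fetch_numbered(page: int, per_page: int = 3) -> dict:
    """Style 1 (assumed): the caller asks for page N; an empty page means done."""
    start = (page - 1) * per_page
    return {"results": ITEMS[start:start + per_page]}


def fetch_cursor(cursor: Optional[str] = None, limit: int = 3) -> dict:
    """Style 2 (assumed): each response carries a token for the next page."""
    start = int(cursor) if cursor else 0
    end = start + limit
    next_token = str(end) if end < len(ITEMS) else None
    return {"data": ITEMS[start:end], "meta": {"next_token": next_token}}


def read_numbered() -> Iterator[dict]:
    """Walk page numbers until an empty page comes back."""
    page = 1
    while True:
        rows = fetch_numbered(page)["results"]
        if not rows:
            return
        yield from rows
        page += 1


def read_cursor() -> Iterator[dict]:
    """Follow next_token until the API stops returning one."""
    cursor = None
    while True:
        body = fetch_cursor(cursor)
        yield from body["data"]
        cursor = body["meta"]["next_token"]
        if cursor is None:
            return


numbered_rows = list(read_numbered())
cursor_rows = list(read_cursor())
```

Both loops yield the same rows, but neither the loop nor the unpacking code can be reused across the two styles.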

What if there were a way of reading from APIs that abstracted all the low-level grunt work and worked the same way everywhere? Good news! That’s exactly what Steampipe does. It’s a tool that translates REST API calls directly into SQL tables. Here are three examples of questions that you can ask and answer using Steampipe.

1. Twitter: What are recent tweets that mention PySpark?

Here’s a SQL query to ask that question:

select
  id,
  text
from
  twitter_search_recent
where
  query = 'pyspark'
order by
  created_at desc
limit 5;

Here’s the answer:

+---------------------+------------------------------------------------------------------------------------------------>
| id                  | text                                                                                           >
+---------------------+------------------------------------------------------------------------------------------------>
| 1526351943249154050 | @dump Tenho trabalhando bastante com Spark, mas especificamente o PySpark. Vale a pena usar um >
| 1526336147856687105 | RT @MitchellvRijkom: PySpark Tip ⚡                                                            >
|                     |                                                                                                >
|                     | When to use what StorageLevel for Cache / Persist?                                             >
|                     |                                                                                                >
|                     | StorageLevel decides how and where data should be s…                                           >
| 1526322757880848385 | Solve challenges and exceed expectations with a career as a AWS Pyspark Engineer. https://t.co/>
| 1526318637485010944 | RT @JosMiguelMoya1: #pyspark #spark #BigData curso completo de Python y Spark con PySpark      >
|                     |                                                                                                >
|                     | https://t.co/qf0gIvNmyx                                                                        >
| 1526318107228524545 | RT @money_personal: PySpark & AWS: Master Big Data With PySpark and AWS                        >
|                     | #ApacheSpark #AWSDatabases #BigData #PySpark #100DaysofCode                                    >
|                     | -> http…                                                                                       >
+---------------------+------------------------------------------------------------------------------------------------>

The table that’s being queried here, twitter_search_recent, receives the output from Twitter’s /2/tweets/search/recent endpoint and formulates it as a table with a documented set of columns. You don’t have to make an HTTP call to that API endpoint or unpack the results; you just write a SQL query that refers to the documented columns. One of those columns, query, is special: it encapsulates Twitter’s query syntax. Here, we are just looking for tweets that match PySpark, but we could as easily refine the query by pinning it to specific users, URLs, types (is:retweet, is:reply), properties (has:mentions, has:media), and so on. That query syntax is the same no matter how you’re accessing the API: from Python, from R, or from Steampipe. It’s plenty to think about, and it is all you should really need to know when crafting queries to mine Twitter data.

2. GitHub: What are repositories that mention PySpark?

Here’s a SQL query to ask that question:

select
  name,
  owner_login,
  stargazers_count
from
  github_search_repository
where
  query = 'pyspark'
order by stargazers_count desc
limit 10;

Here’s the answer:

+----------------------+-------------------+------------------+
| name                 | owner_login       | stargazers_count |
+----------------------+-------------------+------------------+
| SynapseML            | microsoft         | 3297             |
| spark-nlp            | JohnSnowLabs      | 2725             |
| incubator-linkis     | apache            | 2524             |
| ibis                 | ibis-project      | 1805             |
| spark-py-notebooks   | jadianes          | 1455             |
| petastorm            | uber              | 1423             |
| awesome-spark        | awesome-spark     | 1314             |
| sparkit-learn        | lensacom          | 1124             |
| sparkmagic           | jupyter-incubator | 1121             |
| data-algorithms-book | mahmoudparsian    | 1001             |
+----------------------+-------------------+------------------+

This looks very similar to the first example! In this case, the table that’s being queried, github_search_repository, receives the output from GitHub’s /search/repositories endpoint and formulates it as a table with a documented set of columns.

In both cases the Steampipe documentation not only shows you the schemas that govern the mapped tables, it also gives examples, for both Twitter and GitHub, of SQL queries that use the tables in various ways.

Note that these are just two of many available tables. The Twitter API is mapped to 7 tables, and the GitHub API is mapped to 41 tables.

3. Twitter + GitHub: What have owners of PySpark-related repositories tweeted lately?

To answer this question we need to consult two different APIs, then join their results. That’s even harder to do, in a consistent way, when you’re reasoning over REST payloads in Python or R. But this is the kind of thing SQL was born to do. Here’s one way to ask the question in SQL.

-- find pyspark repos
with github_repos as (
  select
    name,
    owner_login,
    stargazers_count
  from
    github_search_repository
  where
    query = 'pyspark' and name ~ 'pyspark'
  order by stargazers_count desc
  limit 50
),

-- find twitter handles of repo owners
github_users as (
  select
    u.login,
    u.twitter_username
  from
    github_user u
  join
    github_repos r
  on
    r.owner_login = u.login
  where
    u.twitter_username is not null
),

-- find corresponding twitter users
twitter_userids as (
  select
    id
  from
    twitter_user t
  join
    github_users g
  on
    t.username = g.twitter_username
)

-- find tweets from these users
select
  t.author->>'username' as twitter_user,
  'https://twitter.com/' || (t.author->>'username') || '/status/' || t.id as url,
  t.text
from
  twitter_user_tweet t
join
  twitter_userids u
on
  t.user_id = u.id
where
  t.created_at > now()::date - interval '1 week'
order by
  t.author
limit 5

Here is the answer:

+----------------+---------------------------------------------------------------+------------------------------------->
| twitter_user   | url                                                           | text                                >
+----------------+---------------------------------------------------------------+------------------------------------->
| idealoTech     | https://twitter.com/idealoTech/status/1524688985649516544     | Are you able to find creative soluti>
|                |                                                               |                                     >
|                |                                                               | Join our @codility Order #API Challe>
|                |                                                               |                                     >
|                |                                                               | #idealolife #codility #php          >
| idealoTech     | https://twitter.com/idealoTech/status/1526127469706854403     | Our #ProductDiscovery team at idealo>
|                |                                                               |                                     >
|                |                                                               | Think you can solve it? 😎          >
|                |                                                               | ➡️  https://t.co/ELfUfp94vB https://t>
| ioannides_alex | https://twitter.com/ioannides_alex/status/1525049398811574272 | RT @scikit_learn: scikit-learn 1.1 i>
|                |                                                               | What's new? You can check the releas>
|                |                                                               |                                     >
|                |                                                               | pip install -U…                     >
| andfanilo      | https://twitter.com/andfanilo/status/1524999923665711104      | @edelynn_belle Thanks! Sometimes it >
| andfanilo      | https://twitter.com/andfanilo/status/1523676489081712640      | @juliafmorgado Good luck on the reco>
|                |                                                               |                                     >
|                |                                                               | My advice: power through it + a dead>
|                |                                                               |                                     >
|                |                                                               | I hated my first few short videos bu>
|                |                                                               |                                     >
|                |                                                               | Looking forward to the video 🙂     >
+----------------+---------------------------------------------------------------+------------------------------------->

When APIs frictionlessly become tables, you can devote your full attention to reasoning over the abstractions represented by those APIs. Larry Wall, the creator of Perl, famously said: “Easy things should be easy, hard things should be possible.” The first two examples are things that should be, and are, easy: each is just 10 lines of simple, straight-ahead SQL that requires no wizardry at all.

The third example is a harder thing. It would be hard in any programming language. But SQL makes it possible in several nice ways. The solution is made of concise stanzas (CTEs, Common Table Expressions) that form a pipeline. Each phase of the pipeline handles one clearly-defined piece of the problem. You can validate the output of each phase before proceeding to the next. And you can do all this with the most mature and widely-used grammar for selection, filtering, and recombination of data.
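
The phase-by-phase validation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a Steampipe session: it uses Python's built-in sqlite3 module in place of Postgres, and the `repos` and `users` tables, their rows, and the `run` helper are all invented for the example. Each CTE is checked on its own before the next one is stacked on top of it.

```python
import sqlite3

# In-memory stand-in database; tables and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table repos (name text, owner_login text, stargazers_count integer);
    create table users (login text, twitter_username text);
    insert into repos values
        ('pyspark-examples', 'alice', 120),
        ('pyspark-notes', 'bob', 80),
        ('dataframes', 'carol', 300);
    insert into users values
        ('alice', 'alice_tw'), ('bob', null), ('carol', 'carol_tw');
""")

# Phase 1: find the repositories of interest.
STAGE_1 = """
    with pyspark_repos as (
        select name, owner_login from repos where name like '%pyspark%'
    )
"""

# Phase 2: extend the pipeline with the owners that have a Twitter handle.
STAGE_2 = STAGE_1 + """,
    repo_owners as (
        select u.login, u.twitter_username
        from users u join pyspark_repos r on r.owner_login = u.login
        where u.twitter_username is not null
    )
"""


def run(ctes: str, final_select: str) -> list:
    """Run the CTE prefix followed by a final select and return all rows."""
    return conn.execute(ctes + final_select).fetchall()


# Inspect each phase separately before building on it.
stage_1_rows = run(STAGE_1, "select * from pyspark_repos order by name")
stage_2_rows = run(STAGE_2, "select * from repo_owners order by login")
```

Because every phase is a named query, selecting from it directly is enough to confirm it returns what the next phase expects.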

Do I have to use SQL?

No! If you like the idea of mapping APIs to tables, but you would rather reason over those tables in Python or R dataframes, then Steampipe can oblige. Under the covers it’s Postgres, enhanced with foreign data wrappers that handle the API-to-table transformation. Anything that can connect to Postgres can connect to Steampipe, including SQL drivers like Python’s psycopg2 and R’s RPostgres as well as business-intelligence tools like Metabase, Tableau, and PowerBI. So you can use Steampipe to frictionlessly consume APIs into dataframes, then reason over the data in Python or R.
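
As a rough sketch of that driver route, the helper below turns any query result into a list of records that a dataframe library can load. The function name `query_to_records` is made up for this example, and the demonstration runs against a local sqlite3 table that merely borrows the name github_search_repository and two rows from the result shown earlier. With a running Steampipe service, the connection would presumably come from a Postgres driver instead, for instance `psycopg2.connect(...)` with whatever host, port, database, user, and password your Steampipe installation reports; those details are not given in this article.

```python
import sqlite3
from typing import Any, Dict, List


def query_to_records(conn: Any, sql: str) -> List[Dict[str, Any]]:
    """Run a query on a DB-API connection and return rows as dicts."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        # cursor.description holds one entry per column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


# Local stand-in for a Steampipe connection (assumption: sqlite3 is used
# here only so the sketch runs without any external service).
demo = sqlite3.connect(":memory:")
demo.execute(
    "create table github_search_repository (name text, stargazers_count integer)"
)
demo.executemany(
    "insert into github_search_repository values (?, ?)",
    [("SynapseML", 3297), ("spark-nlp", 2725)],
)

records = query_to_records(
    demo,
    "select name, stargazers_count from github_search_repository "
    "order by stargazers_count desc",
)
```

A list of dicts like `records` can be handed to `pandas.DataFrame(records)` to continue the analysis in a dataframe.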

But if you haven’t used SQL in this way before, it’s worth a look. Consider this comparison of SQL to Pandas from “How to rewrite your SQL queries in Pandas.”

SQL | Pandas
select * from airports | airports
select * from airports limit 3 | airports.head(3)
select id from airports where ident = 'KLAX' | airports[airports.ident == 'KLAX'].id
select distinct type from airports | airports.type.unique()
select * from airports where iso_region = 'US-CA' and type = 'seaplane_base' | airports[(airports.iso_region == 'US-CA') & (airports.type == 'seaplane_base')]
select ident, name, municipality from airports where iso_region = 'US-CA' and type = 'large_airport' | airports[(airports.iso_region == 'US-CA') & (airports.type == 'large_airport')][['ident', 'name', 'municipality']]

We can argue the merits of one style versus the other, but there’s no question that SQL is the most universal and widely-implemented way to express these operations on data. So no, you don’t have to use SQL to its fullest potential in order to benefit from Steampipe. But you might find that you want to.
