Inner Join
Let’s say we have these two tables:
-
The
Persontable:Name Surname FiscalCode Age Gender Domiziano Scarcelli 001 22 M Mario Rossi 002 35 M Lucia Del Perro 003 22 F Luciano Marchesi 004 55 M Lucia Gasparri 005 22 F -
And the
JobtableFCPerson IDJob Salary 001 2 23000 002 1 40000 003 4 22000
And we want to find the salary of each person that has a job. We need to combine the rows where Person.FiscalCode is equal to the Job.FCPerson. We can do that with a join.
SQL
-- Implicit join
SELECT *
FROM Person p, Job j
WHERE p.FiscalCode = j.FCPerson;
-- Explicit join
SELECT *
FROM Person p
JOIN PersonJob j
ON p.FiscalCode = j.FCPerson;This query returns the Cartesian product between the two tables, filtering for those rows that satisfy the join condition.
| Name | Surname | FiscalCode | Age | Gender | FCPerson | IDJob | Salary |
|---|---|---|---|---|---|---|---|
| Domiziano | Scarcelli | 001 | 22 | M | 001 | 2 | 23000 |
| Mario | Rossi | 002 | 35 | M | 002 | 1 | 40000 |
| Lucia | Del Perro | 003 | 22 | F | 003 | 4 | 22000 |
The condition p.FiscalCode = j.FCPerson is essential; otherwise, the system would return all possible pairs between the rows of the tables, including pairs of rows corresponding to data from two different people.
Note
The join condition connects the Foreign Key of one table to the Primary Key of the other.**
Pandas
In Pandas, we can use the DataFrame.merge() or the DataFrame.join() functions to do a join.
With .merge() the result is straight-forward.
result = person.merge(job, left_on='FiscalCode', right_on='FCPerson')With .join, we first have to set the indexes of the columns we want to join using set_index(), then perform the join, and do a reset_index in order to make the FiscalCode index a column again.
person_indexed = person.set_index('FiscalCode')
job_indexed = job.set_index('FCPerson')
result = person_indexed.join(job_indexed, how='inner').reset_index()Left Join
todo: modifica in modo da avere pandas + tabelle aggiornate in inglese + risultati corretti
The Left Join performs the same operation as the classic Join, but takes all the rows from the first table, including those that do not have a corresponding match in the “right” table (i.e., the table that appears second in the query) according to the imposed condition.
Unlike the inner join, it only uses explicit notation.
SELECT *
FROM Person p
LEFT JOIN Job j
ON p.FiscalCode = j.FCPerson;This query returns the Cartesian product between the Persona and PersonaLavoro tables, filtering for those rows that satisfy the condition Person.FiscalCode = Job.FCPerson, but also including those people who do not have a job, i.e., who do not have a corresponding entry in Job.FCPerson.
In other words, it takes the inner join between the tables, but then appends also the rows in the left which key do not appear in the right table.
| Name | Surname | FiscalCode | Age | Gender | FCPerson | IDJob | Salary |
|---|---|---|---|---|---|---|---|
| Domiziano | Scarcelli | 001 | 22 | M | 001 | 2 | 23000 |
| Mario | Rossi | 002 | 35 | M | 002 | 1 | 40000 |
| Lucia | Del Perro | 003 | 22 | F | 003 | 4 | 22000 |
| Luciano | Marchesi | 004 | 55 | M | NULL | NULL | NULL |
| Lucia | Gasparri | 005 | 22 | F | NULL | NULL | NULL |
Pandas
In Pandas, this can be achieved with .merge() using the parameter how='left':
result_left = person.merge(job, how='left', left_on='FiscalCode', right_on='FCPerson')Right Join
Opposite to the left join, the right join performs a classic join but also returns all those rows from the right table (the second) that do not have a corresponding match in the left table (the first) according to the imposed condition.
SELECT *
FROM Job j
RIGHT JOIN Person p
ON p.FiscalCode = j.FCPerson;| FCPerson | IDJob | Salary | Name | Surname | FiscalCode | Age | Gender |
|---|---|---|---|---|---|---|---|
| 001 | 2 | 23000 | Domiziano | Scarcelli | 001 | 22 | M |
| 002 | 1 | 40000 | Mario | Rossi | 002 | 35 | M |
| 003 | 4 | 22000 | Lucia | Del Perro | 003 | 22 | F |
| NULL | NULL | NULL | Luciano | Marchesi | 004 | 55 | M |
| NULL | NULL | NULL | Lucia | Gasparri | 005 | 22 | F |
Pandas
Same as the left join, we can use the merge(how="right") to perform the right join in Pandas:
result_right = person.merge(job, how='right', left_on='FiscalCode', right_on='FCPerson')