How to merge tables?

Hi,

i have several tables in with following columns:

Product, Month

I have 1 table for each month, for example:

Product, January
Cookies, 15

and

Product, February
Cookies, 2

and

Product, March
Easter eags, 20

I would like to merge these tables in to:

Product, January, February, March
Cookies, 15, 2, 
Easter eggs, , , 20 

How would I do this best? I did not find a keyword in RPA.Tables. I thought I better ask before I start converting in to python dicts forth and back and probably go insane trying :wink:

Best regards,
Markus

That’s an interesting challenge. I don’t think Tables would easily handle that, since you’d have to merge tables by using product as a key/index. I was going to do some small changes to the library today anyways, so I think a specific Merge Tables could be added and be useful elsewhere.

3 Likes

Yes, you are right! I had not thought about that Product is a key column. That’s probably why pandas have the concept join instead of merge.

I would probably have to define in which columns I like to join tables and if I only want include products that exist in all tables (inner join) or also those exist in at least 1 table (outer join).

This is working for me:

from robot.api.deco import library, keyword
from RPA.Tables import Tables
from typing import List, Dict
from pandas import DataFrame

@library
class PandasLibrary:

    library = Tables()

    @keyword
    def join_dicts(self, data_sets: List[List[Dict]]) -> DataFrame:
        result: DataFrame = DataFrame.from_dict([])
        for data in data_sets:
            new_data = DataFrame.from_dict(data)
            if result.empty:
                result = new_data
            else:
                result = result.merge(new_data, how='outer')
        return result

    # keyword for converting pandas back to RPA.Tables
    @keyword
    def create_table_from_panda(self, data: DataFrame):
        records = data.to_dict(orient='records')
        return self.library.create_table(records)
    
    # pandas actually make nicer excel
    @keyword
    def save_panda_to_excel(self, data: DataFrame, excelfile: str, sheet_name='Seite 1'):
        data.to_excel(excelfile, sheet_name=sheet_name, index=False)

First keyword merges tables exactly like required. I used the second one for creating an excel file from the resulting table:

Convert panda to table and save in excel
   ${table}    PandasLibrary.create table from panda   ${PRODUKT_SUMMARY}
    RPA.Excel.Files.create workbook    ${OUTPUT_DIR}/produkte.xlsx
    RPA.Excel.Files.create worksheet    ProduktĂĽbersicht    ${table}
    RPA.Excel.Files.save workbook

But it is easier writing the pandas.DataFrame directly using the third keyword:

Save panda to excel
    PandasLibrary.save panda to excel    ${PRODUKT_SUMMARY}    ${OUTPUT_DIR}/panda.xlsx    ProduktĂĽbersicht

Maybe the first merging keyword is helpful for a RPA.Tables.merge keyword :slight_smile:

2 Likes

Nice!

I forgot to report back, but I added a simple merge tables keyword in the previous rpaframework release. It’s essentially a (version of) outer join; It joins all columns and then either a) appends all rows from all tables or b) merges rows based on a given column as key.

The example should look quite familiar: https://robocorp.com/docs/libraries/rpa-framework/rpa-tables/keywords#merge-tables

Tables itself doesn’t use pandas since it and its dependencies are quite large, and tables will be included everywhere.