I was checking out ResearcherID and created a page for Darrell Ulm: http://www.researcherid.com/rid/Y-5083-2018. It looks like another useful site for listing research work, much like ORCID, which also has a profile for Darrell Ulm: https://orcid.org/0000-0002-0513-0416. I'm still working out the nuances between ResearcherID and ORCID, since they appear quite similar in aim: both provide a unique identifier for researchers and their publications. Darrell Ulm's ResearcherID page, however, has some interesting connections to other resources, notably a record of reviewing efforts. It's fascinating to see how these platforms interconnect and contribute to the broader ecosystem of scholarly communication and recognition. I still need to explore how these systems integrate and what unique benefits each offers to researchers like Darrell; it's all part of navigating the evolving landscape of research visibility.
Made a post at the Databricks forum, thinking about how to take two DataFrames with the same number of rows and merge all of their columns into one DataFrame. This is straightforward: we can use the monotonically_increasing_id() function to assign the same unique IDs to the rows of each DataFrame. Ideally, extra null rows would be added to the DataFrame with fewer rows so the row counts match, although the code below does not do this. Once the IDs are added, a DataFrame join merges all the columns into one DataFrame.

# For two DataFrames that have the same number of rows, merge all columns, row by row.
# Get the function monotonically_increasing_id so we can assign ids to each row, when the
# DataFrames have the same number of rows.
from pyspark.sql.functions import monotonically_increasing_id

# Create some test data with 3 and 4 columns.
df1 = sqlContext.createDataFrame([("foo", "bar", "too", "aaa"), ("bar...
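Since the snippet above gets cut off, here is a minimal, self-contained sketch of the whole merge. It assumes a sqlContext is available, that both DataFrames have the same number of rows and the same partitioning (so monotonically_increasing_id() produces matching ids on both sides), and the column names and second test DataFrame are made up for illustration.

from pyspark.sql.functions import monotonically_increasing_id

# Two small test DataFrames with the same number of rows (column names are illustrative).
df1 = sqlContext.createDataFrame(
    [("foo", "bar", "too", "aaa"), ("bar", "baz", "tee", "bbb")],
    ["a1", "a2", "a3", "a4"])
df2 = sqlContext.createDataFrame(
    [(1, 2, 3), (4, 5, 6)],
    ["b1", "b2", "b3"])

# Tag each row of both DataFrames with an id column.
# Note: the ids only line up if both DataFrames are partitioned the same way.
df1_with_id = df1.withColumn("row_id", monotonically_increasing_id())
df2_with_id = df2.withColumn("row_id", monotonically_increasing_id())

# Join on the id and drop it, leaving one DataFrame with all seven columns.
merged = df1_with_id.join(df2_with_id, on="row_id").drop("row_id")
merged.show()

A more defensive variant would turn the monotonic id into a consecutive row number (for example with row_number() over a window ordered by the id) before joining, since monotonically_increasing_id() only guarantees uniqueness within a DataFrame, not identical values across two DataFrames.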