12 Feb Pyxis Research Study Program
After two years of hard work, we at Pyxis are celebrating the first master’s thesis on age prediction in social media, completed with the support of our Postgraduate Study Program.
We invite you to learn more about the program: how the process went, what the research was about, and what the results were.
Pyxis Research Study Program
The idea behind the Pyxis Research Study Program, promoted by Pyxis Research, is to make it easier for people who work at Pyxis to pursue postgraduate and doctoral studies by providing them with support. Applicants must propose a subject of their own interest that is also of interest to Pyxis and adds value to the environment.
Promoting postgraduate studies has several objectives, such as encouraging the continuous training of the people who are part of Pyxis, strengthening ties with academia, and improving the acquisition of cutting-edge knowledge across the entire Pyxis ecosystem.
The support consists of a set of working hours that the student can use at their convenience. The number of hours depends on each person’s workload at Pyxis, the duration of the study program, and the student’s level of progress at the time the scholarship is granted.
The work that a thesis demands is not linear: some moments require more dedication than others. Reconciling that with software work, which is itself irregular because projects have demand peaks, is a big challenge.
First project: Age Prediction in Social Media
For the first project, we wanted a thesis subject with real impact and interest within Pyxis’s area of work. The idea of doing something on natural language processing and big data arose.
The objective of this thesis was to explore the problem of predicting age in order to complete a person’s profile, in the particular context of the region and with a strong focus on the Spanish language, for which we have a substantial data corpus and analysis tools of proven practical utility.
The main idea was to analyze different approaches and verify which ones were useful. Compared with similar works that reached a 48% success rate, this project reached 63%.
Another important aspect of the project is that there aren’t many similar experiments in Spanish; most of the existing papers and studies are in English. This meant the work required a data corpus in Spanish.
We can say that we are getting closer. We explored several options and made progress. We also know that no study, whether in Spanish or in English, exceeds a 70% success rate.