Pharma Clinical Trials

This section covers a dataset that tags clinical trials with predictions of their success.

Data is updated weekly on Fridays, as it is made available by regulatory filers.

The dataset covers 850+ tickers, with history available from 1999-11-01 onwards.

Tutorials are the best documentation: Clinical Trials Tutorial

Input Datasets

Regulatory Filings; Biochemical Data

Models Used

Deep Learning Encoders; Language Models

Model Outputs

Success prediction; Expected duration

Description

We predict the success of a clinical trial, its duration, and its expected economic impact, including potential market reactions, using machine learning models. The dataset also provides detailed metadata about each trial, which is used to predict regulatory phase success and approval rates, helping users anticipate outcomes.

The model achieves an 87% ROC-AUC, which we report as the highest among commercially available solutions. With an average of 1,052 new clinical trials launched each week, the platform lets you screen for and focus on the most promising opportunities.
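
For readers unfamiliar with the metric, ROC-AUC can be read as the probability that a randomly chosen successful trial receives a higher predicted score than a randomly chosen unsuccessful one. The sketch below computes it from that definition. The labels and scores are synthetic and only illustrate the calculation; they are not drawn from this dataset, and this is not the evaluation code behind the reported figure.

```python
def roc_auc(labels, scores):
    """ROC-AUC from its pairwise definition.

    Counts, over every (positive, negative) pair, how often the positive
    example has the higher score. Ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Synthetic example: 1 = trial succeeded, 0 = trial failed.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]

auc = roc_auc(labels, scores)
print(f"ROC-AUC: {auc:.3f}")  # 14 of 16 pairs are ranked correctly -> 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking. The pairwise loop is quadratic in the number of examples, so for large evaluations a rank-based implementation is the more practical choice.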

Data Access

Prediction Data:

import sovai as sov
# Load the full history of trial outcome predictions
df_clinical = sov.data("clinical/predict", full_history=True)

Description Data:

import sovai as sov
# Load the full history of descriptive trial metadata
df_trials = sov.data("clinical/trials", full_history=True)

Accessing Specific Tickers

You can also retrieve data for specific tickers. For example:

import sovai as sov
# Restrict the query to Pfizer (ticker PFE)
df_pfizer = sov.data("clinical/predict", tickers=["PFE"])
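
Once a prediction frame is loaded, a typical next step is to shortlist the trials with the highest predicted success. The following is a minimal sketch of that screening step, assuming the result is a pandas DataFrame. The column name `success_prediction`, the helper `screen_trials`, the threshold, and the demo rows are all hypothetical placeholders; check the columns of the frame you actually receive and adjust accordingly.

```python
import pandas as pd


def screen_trials(df, score_col="success_prediction", min_score=0.7, top_n=5):
    """Keep rows scoring at least min_score, best first, at most top_n rows.

    score_col is a hypothetical column name; pass the real one for your frame.
    """
    kept = df[df[score_col] >= min_score]
    return kept.sort_values(score_col, ascending=False).head(top_n)


# Synthetic stand-in for a frame such as sov.data("clinical/predict").
# Tickers, column names, and values are made up for illustration only.
df_demo = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "CCC", "DDD"],
        "success_prediction": [0.91, 0.42, 0.78, 0.65],
    }
)

shortlist = screen_trials(df_demo, min_score=0.7, top_n=2)
print(shortlist)
```

Passing the score column as a parameter keeps the helper usable even if the real column is named differently.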

Data Dictionary

Column Name: Description
ticker: Stock ticker symbol of the company
date: Date the complaint was received
company: Name of the company the complaint is against
bloomberg_share_id: Bloomberg Global Share Class Level Identifier
culpability_score: Score indicating the company's culpability in the complaint
complaint_score: Score based on the severity of the complaint
grievance_score: Score based on the grievance level of the complaint
total_risk_rating: Overall risk rating combining culpability, complaint, and grievance scores
product: Financial product related to the complaint
sub_product: Specific sub-category of the financial product
issue: Main issue of the complaint
sub_issue: Specific sub-category of the issue
consumer_complaint_narrative: Narrative description of the complaint provided by the consumer
company_public_response: Public response provided by the company
state: State where the complaint was filed
zip_code: ZIP code of the consumer

Use Cases

Risk Assessment: Evaluate the risk profile of financial institutions based on complaint data.
Consumer Sentiment Analysis: Analyze consumer sentiment towards different financial products and companies.
Regulatory Compliance: Monitor compliance issues and identify potential regulatory risks.
Product Performance Evaluation: Assess the performance and issues related to specific financial products.
Competitive Analysis: Compare complaint profiles across different financial institutions.
Geographic Trend Analysis: Identify regional trends in financial complaints.
Customer Service Improvement: Identify areas for improvement in customer service based on complaint types and resolutions.
ESG Research: Incorporate complaint data into Environmental, Social, and Governance (ESG) assessments.
Fraud Detection: Identify patterns that might indicate fraudulent activities.
Policy Impact Assessment: Evaluate the impact of policy changes on consumer complaints over time.

The resulting dataset provides a comprehensive view of consumer complaints in the financial sector, enabling detailed analysis of company performance, consumer issues, and regulatory compliance.
