Description
Current Behaviour
I have the following DataFrame:
df: pyspark.sql.dataframe.DataFrame
tpep_pickup_datetime: timestamp
tpep_dropoff_datetime: timestamp
trip_distance: double
fare_amount: double
pickup_zip: integer
dropoff_zip: integer
id: long
When I apply the following code:
```python
import json

from ydata_profiling import ProfileReport

report = ProfileReport(
    df,
    title=dataset_name,
    infer_dtypes=False,
    interactions=None,
    missing_diagrams=None,
    correlations={
        "auto": {"calculate": False},
        "pearson": {"calculate": True},
        "spearman": {"calculate": True},
    },
)

# Convert the profile to JSON
profiling_json = json.loads(report.to_json())

# Render the HTML report (displayHTML is the Databricks notebook helper)
report_html = report.to_html()
displayHTML(report_html)
```
I get the error from the title (`KeyError: 'n_invalid_dates'`). When I drop the date columns, the HTML report renders without problems.
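A minimal sketch of the workaround of dropping the date columns, assuming the column types are read from the Spark DataFrame's `dtypes` attribute as `(name, type)` pairs. The helper name `split_temporal_columns` is hypothetical and not part of ydata-profiling. The type strings are the ones Spark reports for this schema.

```python
def split_temporal_columns(dtypes):
    """Split (name, type) pairs into date/timestamp columns and all others.

    `dtypes` is assumed to have the shape of a Spark DataFrame's `dtypes`
    attribute, e.g. [("id", "bigint"), ("ts", "timestamp")].
    """
    temporal_types = {"timestamp", "timestamp_ntz", "date"}
    temporal = [name for name, dtype in dtypes if dtype in temporal_types]
    other = [name for name, dtype in dtypes if dtype not in temporal_types]
    return temporal, other


# Schema from this report, written out by hand so the sketch runs without Spark.
dtypes = [
    ("tpep_pickup_datetime", "timestamp"),
    ("tpep_dropoff_datetime", "timestamp"),
    ("trip_distance", "double"),
    ("fare_amount", "double"),
    ("pickup_zip", "int"),
    ("dropoff_zip", "int"),
    ("id", "bigint"),
]

temporal_cols, other_cols = split_temporal_columns(dtypes)

# With a real Spark DataFrame, the date columns could then be dropped with:
#   df_without_dates = df.drop(*temporal_cols)
```

This only sidesteps the error. It does not give a profile of the date columns, which is what I actually need.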
Expected Behaviour
I need the date columns to be profiled in the HTML report as well. What should I do?
Data Description
This is my dataframe:
df:pyspark.sql.dataframe.DataFrame
tpep_pickup_datetime:timestamp
tpep_dropoff_datetime:timestamp
trip_distance:double
fare_amount:double
pickup_zip:integer
dropoff_zip:integer
id:long
Code that reproduces the bug
KeyError: 'n_invalid_dates'
table2 = Table(
62 [
63 {"name": "Minimum", "value": fmt(summary["min"]), "alert": False},
64 {"name": "Maximum", "value": fmt(summary["max"]), "alert": False},
65 {
66 "name": "Invalid dates",
---> 67 "value": fmt(summary["n_invalid_dates"]),
68 "alert": False,
69 },
70 {
71 "name": "Invalid dates (%)",
72 "value": fmt_percent(summary["p_invalid_dates"]),
73 "alert": False,
74 },
75 ],
pandas-profiling version
%pip install ydata-profiling==4.16.1
Dependencies
!pip install great_expectations==1.4.2
Python >=3.10
OS
No response
Checklist
- There is not yet another bug report for this issue in the issue tracker
- The problem is reproducible from this bug report. This guide can help to craft a minimal bug report.
- The issue has not been resolved by the entries listed under Common Issues.