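Pydantic is currently the most widely used data validation library for Python
- As a dynamically typed language, Python has advantages in development speed and ease of use
- For the same reason, programs need stronger type checking and data validation than the language provides by itself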
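Key features of Pydantic
- Customizable and extensible: validates the data types of arbitrary Python objects and supports nested structures
- Flexible validation: a rich set of types, control over when validation runs, and strict vs lax modes
- Serialization: Pydantic objects can be serialized to, and deserialized from, dictionaries and JSON strings
- High performance: the core validation logic is written in Rust, which makes it fast and reliable under high throughput
- Mature ecosystem: a dependency of many popular libraries (FastAPI, LangChain, etc.), with an active community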
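The nesting and serialization features listed above can be illustrated with a minimal sketch. The `Address` and `Person` models are hypothetical names introduced only for this illustration; `model_dump`, `model_dump_json` and `model_validate_json` are the Pydantic v2 method names.

```python
from pydantic import BaseModel


class Address(BaseModel):  # hypothetical nested model
    city: str


class Person(BaseModel):  # hypothetical outer model
    name: str
    address: Address  # nested structure: a plain dict is validated into an Address


person = Person(name="Ada", address={"city": "London"})

as_dict = person.model_dump()        # serialize to a plain dictionary
as_json = person.model_dump_json()   # serialize to a JSON string
restored = Person.model_validate_json(as_json)  # deserialize and validate again
```

The round trip through JSON yields an object equal to the original, because validation rebuilds the nested `Address` from the parsed data.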
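Strict mode vs lax mode
- Strict mode: in general, a value passes validation only if it is already an instance of the target type or of one of its subtypes, with no coercion applied. The StrictBool, StrictBytes, StrictFloat, StrictInt and StrictStr types behave this way
- Lax mode (the default): incoming data is coerced to the declared type where possible, which makes it more tolerant of loosely typed input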
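A small sketch of the difference between the two modes, assuming Pydantic v2. The model names `LaxCounter` and `StrictCounter` are hypothetical and exist only for this comparison.

```python
from pydantic import BaseModel, StrictInt, ValidationError


class LaxCounter(BaseModel):  # hypothetical model using the default (lax) mode
    count: int


class StrictCounter(BaseModel):  # hypothetical model using a strict type
    count: StrictInt


# Lax mode coerces the numeric string to an int.
lax = LaxCounter(count="42")

# The strict type refuses the same input instead of coercing it.
try:
    StrictCounter(count="42")
    strict_rejected = False
except ValidationError:
    strict_rejected = True
```

Lax mode suits input that arrives as text (forms, query strings), while strict types are a better fit when silent coercion would hide a bug.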
Example 1: Pydantic models and validators
from datetime import date
from enum import Enum
from uuid import UUID, uuid4

from pydantic import BaseModel, EmailStr, Field, field_validator, model_validator
from typing_extensions import Self
class Department(Enum):
    HR = "HR"
    SALES = "SALES"
    IT = "IT"
    ENGINEERING = "ENGINEERING"
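# A Pydantic model is similar to a Python dataclass:
# the class declares and stores entity data as type-annotated fields
class Employee(BaseModel):
    """Pydantic model that validates employee information."""

    employee_id: UUID = Field(default_factory=uuid4)  # a fresh UUID for every instance
    name: str = Field(min_length=1, frozen=True)  # frozen: cannot be reassigned after instantiation
    email: EmailStr  # built-in email format validation (needs the email-validator package)
    # email: EmailStr = Field(pattern=r".+@example\.com$")  # regex-based validation
    date_of_birth: date = Field(alias="birth_date", repr=False)  # alias; hidden from repr
    salary: float = Field(alias="compensation", gt=0, repr=False)  # must be greater than 0
    department: Department  # Enum-typed field
    elected_benefits: bool

    @field_validator("date_of_birth")
    @classmethod  # validator example: the employee must be at least 18 years old
    def check_valid_age(cls, date_of_birth: date) -> date:
        today = date.today()
        # Subtract one year if this year's birthday has not happened yet;
        # unlike date(today.year - 18, ...), this cannot fail on 29 February
        had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
        age = today.year - date_of_birth.year - (0 if had_birthday else 1)
        if age < 18:
            raise ValueError("Employees must be at least 18 years old.")
        return date_of_birth

    @model_validator(mode="after")  # runs after the fields have been validated
    def check_it_benefits(self) -> Self:
        department = self.department
        elected_benefits = self.elected_benefits
        # IT staff are all contractors, so they do not qualify for benefits
        if department == Department.IT and elected_benefits:
            raise ValueError(
                "IT employees are contractors and don't qualify for benefits"
            )
        return self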
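# Instantiate the Pydantic model
# (fields that declare an alias are populated through the alias name)
Employee(
    name="Chris DeTuma",
    email="user@example.com",  # placeholder address
    birth_date="1998-04-02",
    compensation=123_000.00,
    department="IT",
    elected_benefits=False,  # True would be rejected by check_it_benefits for IT
)
# Instantiate an Employee object from a dictionary
new_employee_dict = {
    "name": "Chris DeTuma",
    "email": "user@example.com",  # placeholder address
    "birth_date": "1998-04-02",
    "compensation": 123_000.00,
    "department": "IT",
    "elected_benefits": False,
}
Employee.model_validate(new_employee_dict)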
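The mode parameter of model_validator is most often set to one of two values (a third, wrap, also exists):
(1) before: runs before Pydantic's default validation, on the raw input; commonly used to preprocess data, and written as a classmethod
(2) after: runs after the default validation; the validator is an instance method that receives the validated object as self and returns it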
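The two modes can be compared side by side in a minimal sketch. The `Signup` model and its fields are hypothetical names chosen for illustration; only `model_validator` and its `mode` argument come from Pydantic itself.

```python
from typing import Any

from pydantic import BaseModel, ValidationError, model_validator
from typing_extensions import Self


class Signup(BaseModel):  # hypothetical model
    username: str
    password: str
    password_repeat: str

    @model_validator(mode="before")
    @classmethod
    def strip_username(cls, data: Any) -> Any:
        # "before" sees the raw input, which is not guaranteed to be a dict
        if isinstance(data, dict) and isinstance(data.get("username"), str):
            return {**data, "username": data["username"].strip()}
        return data

    @model_validator(mode="after")
    def passwords_match(self) -> Self:
        # "after" sees a fully validated instance, so attributes are typed
        if self.password != self.password_repeat:
            raise ValueError("passwords do not match")
        return self


signup = Signup(username="  ada  ", password="s3cret", password_repeat="s3cret")

try:
    Signup(username="ada", password="s3cret", password_repeat="other")
    mismatch_rejected = False
except ValidationError:
    mismatch_rejected = True
```

A before validator is a natural place for cleanup of raw input, while cross-field rules are usually easier to express in an after validator, where every field already has its final type.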
Example 2: validating function arguments with a decorator
import time
from typing import Annotated
from pydantic import PositiveFloat, Field, EmailStr, validate_call
@validate_call
def send_invoice(
    client_name: Annotated[str, Field(min_length=1)],
    client_email: EmailStr,
    items_purchased: list[str],
    amount_owed: PositiveFloat,
) -> str:
    email_str = f"""
    Dear {client_name}, \n
    Thank you for choosing xyz inc! You
    owe ${amount_owed:,.2f} for the following items: \n
    {items_purchased}
    """
    print(f"Sending email to {client_email}...")
    time.sleep(2)
    return email_str
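Although the @validate_call decorator is less flexible than BaseModel, it still applies powerful validation to function arguments; this saves a lot of time and avoids hand-written boilerplate for type checking and validation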
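How a decorated function reacts to good and bad arguments can be seen in a short sketch. The `apply_discount` function is a hypothetical example, not part of the invoice code above; it avoids `EmailStr` so that no extra package is required.

```python
from typing import Annotated

from pydantic import Field, PositiveFloat, ValidationError, validate_call


@validate_call
def apply_discount(
    price: PositiveFloat,
    percent: Annotated[float, Field(ge=0, le=100)],
) -> float:
    """Return the price after subtracting a percentage discount."""
    return price * (1 - percent / 100)


# Lax mode applies to function arguments too: "200" is coerced to 200.0.
discounted = apply_discount("200", 25)

# Invalid arguments raise ValidationError before the function body runs.
try:
    apply_discount(-5, 10)
    errors = []
except ValidationError as exc:
    errors = exc.errors()
```

Each entry in `errors` is a dictionary describing one failed argument, which is convenient for logging or for building an error response.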
Example 3: validating and loading environment variables
# First export the environment variables (shell)
export DATABASE_HOST="http://somedatabaseprovider.us-east-2.com"
export DATABASE_USER="username"
export DATABASE_PASSWORD="asdfjl348ghl@9fhsl4"
export API_KEY="ajfsdla48fsdal49fj94jf93-f9dsal"
from pydantic import HttpUrl, Field
from pydantic_settings import BaseSettings, SettingsConfigDict
# The BaseSettings base class tries to read each field from the
# environment variable with the matching name
class AppConfig(BaseSettings):
    database_host: HttpUrl
    database_user: str = Field(min_length=5)
    database_password: str = Field(min_length=10)
    api_key: str = Field(min_length=20)
AppConfig()
# AppConfig(
#     database_host=Url('http://somedatabaseprovider.us-east-2.com/'),
#     database_user='username',
#     database_password='asdfjl348ghl@9fhsl4',
#     api_key='ajfsdla48fsdal49fj94jf93-f9dsal'
# )  # output preview
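class AppConfig(BaseSettings):
    # Option 2: use SettingsConfigDict to read the variables from a .env file
    # With case_sensitive=True the names in .env must match the field names
    # exactly (lowercase here); extra="forbid" rejects unknown entries
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=True,
        extra="forbid",
    )
    database_host: HttpUrl
    database_user: str = Field(min_length=5)
    database_password: str = Field(min_length=10)
    api_key: str = Field(min_length=20)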
References:
Pydantic: Simplifying Data Validation in Python
A Practical Guide to using Pydantic