
MongoDB source code analysis: how the createCollection command goes from create.idl to create_gen.cpp

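The MongoDB shell method db.createCollection(name, options) creates a new collection. Because MongoDB implicitly creates a collection the first time a command references it, this method is mainly used to create collections that need specific options.

For example, you use db.createCollection() to create a capped collection, a clustered collection, or a new collection that uses schema validation.

The db.createCollection() method has the following prototype form: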

    db.createCollection( <name>,
       {
         capped: <boolean>,
         timeseries: {                       // Added in MongoDB 5.0
            timeField: <string>,             // required for time series collections
            metaField: <string>,
            granularity: <string>,
            bucketMaxSpanSeconds: <number>,  // Added in MongoDB 6.3
            bucketRoundingSeconds: <number>  // Added in MongoDB 6.3
         },
         expireAfterSeconds: <number>,
         clusteredIndex: <document>,                // Added in MongoDB 5.3
         changeStreamPreAndPostImages: <document>,  // Added in MongoDB 6.0
         size: <number>,
         max: <number>,
         storageEngine: <document>,
         validator: <document>,
         validationLevel: <string>,
         validationAction: <string>,
         indexOptionDefaults: <document>,
         viewOn: <string>,
         pipeline: <pipeline>,
         collation: <document>,
         writeConcern: <document>
       }
    )

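Parameters of db.createCollection():

| Parameter | Type | Description |
| --- | --- | --- |
| capped | Boolean | Whether the collection is a capped (fixed-size) collection (default false) |
| size | Number | Maximum size of a capped collection in bytes; only applies when capped=true |
| max | Number | Maximum number of documents in a capped collection |
| validator | Document | JSON Schema validator that ensures documents match a given format |
| storageEngine | Document | Storage-engine-specific configuration (for example WiredTiger parameters) |
| indexes | Array | Indexes predefined when the collection is created |
| writeConcern | Document | Default write concern level |
| readConcern | Document | Default read concern level |
| autoIndexId | Boolean | Whether to create an index on the _id field automatically (default true) |
| viewOn | String | Source collection to use when creating a view |
| pipeline | Array | Aggregation pipeline of the view |
| collation | Document | Collation rules (for example case sensitivity) |
| timeseries | Document | Time series collection configuration |
| expireAfterSeconds | Number | TTL setting: time in seconds after which documents expire automatically |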
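To make the link between the shell helper and the server side concrete: the helper ends up as the `create` database command (the same name that CmdCreate registers in the server source), with the collection name as the value of the first key and the options placed alongside it. The sketch below only assembles that command document as a Python dict. The function name `build_create_command` and the sample collection are invented for the illustration, and nothing is sent to a server.

```python
def build_create_command(name, options=None):
    """Build the document for the `create` command (hypothetical helper)."""
    # The command name must be the first key; its value is the collection name.
    command = {"create": name}
    for key, value in (options or {}).items():
        if key == "create":
            raise ValueError("'create' is reserved for the collection name")
        command[key] = value
    return command


# A capped collection limited to 1 MiB and 1000 documents.
cmd = build_create_command("log", {"capped": True, "size": 1048576, "max": 1000})
print(cmd)
```

Keeping `create` as the first key matters because the server identifies a command by the first element of the document.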

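In the MongoDB source tree, the command implementations live under src\mongo\db\commands:

count_cmd.cpp implements the count command, distinct.cpp implements the distinct command, and dbcommands.cpp holds CmdCreate, CmdDrop, CmdDatasize and others. CmdCreate implements collection creation, and inside it CreateCommand parses the create command. The key question is where CreateCommand comes from. Following the symbol with a code navigation tool leads to create_gen.cpp, and create_gen.cpp in turn is generated from create.idl.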

    /* create collection */
    class CmdCreate : public BasicCommand {
    public:
        CmdCreate() : BasicCommand("create") {}
        virtual bool run(OperationContext* opCtx,
                         const string& dbname,
                         const BSONObj& cmdObj,
                         BSONObjBuilder& result) {
            IDLParserErrorContext ctx("create");
            CreateCommand cmd = CreateCommand::parse(ctx, cmdObj);
            ...
        }
    } cmdCreate;

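create.idl: what is an .idl file?

MongoDB uses an IDL (interface definition language) to generate C++ code, which is a common engineering practice. It reduces boilerplate, speeds up development, avoids hand-writing repetitive logic (field extraction, type checks, error handling), and keeps the code consistent, because every command follows the same validation rules.

The contents of mongo\db\commands\create.idl are: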

    global:
        cpp_namespace: "mongo"

    imports:
        - "mongo/idl/basic_types.idl"

    commands:
        create:
            description: "Parser for the 'create' Command"
            namespace: concatenate_with_db
            cpp_name: CreateCommand
            strict: true
            fields:
                capped:
                    description: "Specify true to create a capped collection. If you specify true, you must also set a maximum size in the 'size' field."
                    type: safeBool
                    default: false
                autoIndexId:
                    description: "Specify false to disable the automatic creation of an index on the _id field."
                    type: safeBool
                    optional: true
                idIndex:
                    description: "Specify the default _id index specification."
                    type: object
                    optional: true
                size:
                ...
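To get a feel for what `default`, `optional` and `strict` ask the generated parser to do, here is a toy interpreter for a hand-copied subset of the spec. It is a sketch under assumptions, not MongoDB's parser: `CREATE_SPEC` and `parse_command` are names made up for the illustration, Python `bool` and `dict` stand in for `safeBool` and `object`, and details such as generic command arguments are ignored.

```python
# Hand-copied subset of create.idl; bool/dict stand in for safeBool/object.
CREATE_SPEC = {
    "strict": True,
    "fields": {
        "capped": {"type": bool, "default": False},
        "autoIndexId": {"type": bool, "optional": True},
        "idIndex": {"type": dict, "optional": True},
    },
}


def parse_command(spec, command_name, cmd_obj):
    """Toy parser: applies type checks, strict mode and defaults."""
    parsed = {}
    for key, value in cmd_obj.items():
        if key == command_name:
            continue  # the command name itself carries the collection name
        field = spec["fields"].get(key)
        if field is None:
            if spec["strict"]:
                raise ValueError(f"unknown field '{key}'")
            continue
        if not isinstance(value, field["type"]):
            raise TypeError(f"field '{key}' has the wrong type")
        parsed[key] = value
    for key, field in spec["fields"].items():
        if key in parsed:
            continue
        if "default" in field:
            parsed[key] = field["default"]
        elif field.get("optional"):
            parsed[key] = None  # plays the role of an empty boost::optional
        else:
            raise ValueError(f"missing required field '{key}'")
    return parsed


print(parse_command(CREATE_SPEC, "create", {"create": "log", "autoIndexId": False}))
```

A field with a `default` always ends up with a value, an `optional` field may stay empty, and with `strict: true` an unknown field is an error rather than being silently dropped.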

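How is create.idl converted into create_gen.cpp?

The buildscripts directory contains an idl directory, which is responsible for generating files from the .idl files under src. The main file to read is buildscripts/idl/idl/generator.py, which emits the C++ code for each cpp_name. The excerpt below is the part that writes the class declaration: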

    def generate(self, spec):
        # type: (ast.IDLAST) -> None
        ...
        spec_and_structs = spec.structs
        spec_and_structs += spec.commands

        for struct in spec_and_structs:
            self.gen_description_comment(struct.description)
            with self.gen_class_declaration_block(struct.cpp_name):
                self.write_unindented_line('public:')

                # Generate a sorted list of string constants
                self.gen_string_constants_declarations(struct)
                self.write_empty_line()

                # Write constructor
                self.gen_class_constructors(struct)
                self.write_empty_line()

                # Write serialization
                self.gen_serializer_methods(struct)

                if isinstance(struct, ast.Command):
                    self.gen_op_msg_request_methods(struct)

                # Write getters & setters
                for field in struct.fields:
                    if not field.ignore:
                        if field.description:
                            self.gen_description_comment(field.description)
                        self.gen_getter(struct, field)
                        if not struct.immutable and not field.chained_struct_field:
                            self.gen_setter(field)

                if struct.generate_comparison_operators:
                    self.gen_comparison_operators_declarations(struct)

                self.write_unindented_line('protected:')
                self.gen_protected_serializer_methods(struct)

                # Write private validators
                if [field for field in struct.fields if field.validator]:
                    self.write_unindented_line('private:')
                    for field in struct.fields:
                        if not field.ignore and not struct.immutable and \
                           not field.chained_struct_field and field.validator:
                            self.gen_validators(field)

                self.write_unindented_line('private:')

                # Write command member variables
                if isinstance(struct, ast.Command):
                    self.gen_known_fields_declaration()
                    self.write_empty_line()

                    self.gen_op_msg_request_member(struct)

                # Write member variables
                for field in struct.fields:
                    if not field.ignore and not field.chained_struct_field:
                        self.gen_member(field)

                # Write serializer member variables
                # Note: we write these out second to ensure the bit fields can be packed by
                # the compiler.
                for field in struct.fields:
                    if _is_required_serializer_field(field):
                        self.gen_serializer_member(field)

            self.write_empty_line()

        for scp in spec.server_parameters:
            if scp.cpp_class is None:
                self._gen_exported_constexpr(scp.name, 'Default', scp.default, scp.condition)
                self._gen_extern_declaration(scp.cpp_vartype, scp.cpp_varname, scp.condition)
            self.gen_server_parameter_class(scp)

        if spec.configs:
            for opt in spec.configs:
                self._gen_exported_constexpr(opt.name, 'Default', opt.default, opt.condition)
                self._gen_extern_declaration(opt.cpp_vartype, opt.cpp_varname, opt.condition)
            self._gen_config_function_declaration(spec)
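The generator logic above boils down to walking the fields of each struct or command and writing C++ text for them. The following toy version shows that idea in a few lines. It is a deliberately simplified sketch: `cpp_type` and `generate_class` are hypothetical names, the type mapping is illustrative, and the emitted class is much smaller than what the real generator.py produces (for example, it returns every getter by const reference).

```python
def cpp_type(field):
    """Map a toy field description to a C++ member type (illustrative mapping)."""
    base = {"safeBool": "bool", "object": "mongo::BSONObj"}[field["type"]]
    return f"boost::optional<{base}>" if field.get("optional") else base


def generate_class(cpp_name, description, fields):
    """Emit a simplified C++ class declaration as a string."""
    lines = [f"/** {description} */", f"class {cpp_name} {{", "public:"]
    # One getter and one setter per field, named after the field.
    for name, field in fields.items():
        ctype, cap = cpp_type(field), name[0].upper() + name[1:]
        lines.append(f"    const {ctype}& get{cap}() const {{ return _{name}; }}")
        lines.append(f"    void set{cap}({ctype} value) {{ _{name} = std::move(value); }}")
    lines.append("private:")
    # One member variable per field, prefixed with an underscore.
    for name, field in fields.items():
        lines.append(f"    {cpp_type(field)} _{name};")
    lines.append("};")
    return "\n".join(lines)


print(generate_class(
    "CreateCommand",
    "Parser for the 'create' Command",
    {
        "capped": {"type": "safeBool"},
        "autoIndexId": {"type": "safeBool", "optional": True},
    },
))
```

The class name comes straight from `cpp_name`, and an `optional: true` field becomes a `boost::optional` member, which matches the shape of the generated header discussed later in this article.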

After buildscripts/idl/idl/generator.py runs, its output is written to the corresponding build folder, \build\opt\mongo\db\commands.

create.idl produces create_gen.h and create_gen.cpp, and compiling the C++ then yields create_gen.obj.
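The naming pattern described here can be written down as a small helper. This is only a sketch of the mapping as this article observes it (an `_gen` suffix, and the source-relative directory mirrored under the build folder). The function name `generated_files` and its default roots are assumptions for the illustration, and forward slashes are used instead of the Windows-style paths above.

```python
from pathlib import PurePosixPath


def generated_files(idl_path, src_root="src", build_root="build/opt"):
    """Return the (header, source) paths assumed for one .idl file."""
    idl = PurePosixPath(idl_path)
    relative = idl.relative_to(src_root)   # mongo/db/commands/create.idl
    stem = relative.stem + "_gen"           # create_gen
    out_dir = PurePosixPath(build_root) / relative.parent
    return str(out_dir / (stem + ".h")), str(out_dir / (stem + ".cpp"))


print(generated_files("src/mongo/db/commands/create.idl"))
```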

\build\opt\mongo\db\commands\create_gen.h: every parameter of the createCollection command appears in this file, together with its getter and setter. Code:

    namespace mongo {

    /**
     * Parser for the 'create' Command
     */
    class CreateCommand {
    public:
        ...
        explicit CreateCommand(const NamespaceString nss);

        static CreateCommand parse(const IDLParserErrorContext& ctxt, const BSONObj& bsonObject);
        static CreateCommand parse(const IDLParserErrorContext& ctxt, const OpMsgRequest& request);
        void serialize(const BSONObj& commandPassthroughFields, BSONObjBuilder* builder) const;
        OpMsgRequest serialize(const BSONObj& commandPassthroughFields) const;
        BSONObj toBSON(const BSONObj& commandPassthroughFields) const;

        const NamespaceString& getNamespace() const { return _nss; }

        bool getCapped() const { return _capped; }
        void setCapped(bool value) & { _capped = std::move(value); }

        const boost::optional<bool> getAutoIndexId() const& { return _autoIndexId; }
        void getAutoIndexId() && = delete;
        void setAutoIndexId(boost::optional<bool> value) & { _autoIndexId = std::move(value); }
        ...

\build\opt\mongo\db\commands\create_gen.cpp: this file contains the CreateCommand parse methods, which turn a createCollection command into a CreateCommand object. Code:

    namespace mongo {
    ...
    CreateCommand::CreateCommand(const NamespaceString nss)
        : _nss(std::move(nss)), _dbName(nss.db().toString()), _hasDbName(true) {
        // Used for initialization only
    }

    CreateCommand CreateCommand::parse(const IDLParserErrorContext& ctxt, const BSONObj& bsonObject) {
        NamespaceString localNS;
        CreateCommand object(localNS);
        object.parseProtected(ctxt, bsonObject);
        return object;
    }

    CreateCommand CreateCommand::parse(const IDLParserErrorContext& ctxt, const OpMsgRequest& request) {
        NamespaceString localNS;
        CreateCommand object(localNS);
        object.parseProtected(ctxt, request);
        return object;
    }
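Both parse overloads follow the same pattern: construct the object around an empty namespace, then let parseProtected fill in the real values. A Python stand-in makes the flow easier to see. Everything in it is illustrative: the class `CreateCommandSketch` is invented, only `capped` is handled, and the database name is passed in explicitly as a simplifying assumption, since the excerpt does not show where the generated C++ obtains it.

```python
class CreateCommandSketch:
    """Python stand-in for the generated CreateCommand class (illustrative only)."""

    def __init__(self, nss=""):
        self._nss = nss        # full namespace, "db.collection"
        self._capped = False   # default taken from create.idl

    @classmethod
    def parse(cls, db_name, cmd_obj):
        # Same shape as the generated code: build an object around an empty
        # namespace, then let the protected parser fill it in.
        obj = cls("")
        obj._parse_protected(db_name, cmd_obj)
        return obj

    def _parse_protected(self, db_name, cmd_obj):
        for key, value in cmd_obj.items():
            if key == "create":
                # namespace: concatenate_with_db -> database name + collection name
                self._nss = f"{db_name}.{value}"
            elif key == "capped":
                self._capped = bool(value)
            else:
                raise ValueError(f"unknown field '{key}'")

    def get_namespace(self):
        return self._nss

    def get_capped(self):
        return self._capped


cmd = CreateCommandSketch.parse("test", {"create": "log", "capped": True})
print(cmd.get_namespace(), cmd.get_capped())
```

Keeping the constructor trivial and doing the real work in a protected parse step lets the two public parse entry points share one construction pattern.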

Summary: buildscripts/idl/idl/generator.py turns create.idl into create_gen.cpp and create_gen.h, which are then compiled into create_gen.obj. The resulting CreateCommand class encapsulates the createCollection command.