Configurations

class llm_analysis.config.DtypeConfig(name='w16a16e16', weight_bits=16, activation_bits=16, embedding_bits=16)[source]

Bases: object

class llm_analysis.config.EnhancedJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: JSONEncoder

default(o)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)

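EnhancedJSONEncoder overrides default so that the dataclass configs in this module can be serialized to JSON. A minimal self-contained sketch of such a dataclass-aware encoder (an illustration of the pattern, not the library's exact implementation; DtypeLike is a local stand-in):

```python
import dataclasses
import json


class DataclassJSONEncoder(json.JSONEncoder):
    """JSON encoder that also handles dataclass instances."""

    def default(self, o):
        # Convert dataclass instances to plain dicts; defer everything
        # else to the base class, which raises TypeError.
        if dataclasses.is_dataclass(o) and not isinstance(o, type):
            return dataclasses.asdict(o)
        return super().default(o)


@dataclasses.dataclass
class DtypeLike:
    # Mirrors two of DtypeConfig's documented fields, for illustration only.
    name: str = "w16a16e16"
    weight_bits: int = 16


print(json.dumps(DtypeLike(), cls=DataclassJSONEncoder))
# {"name": "w16a16e16", "weight_bits": 16}
```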
class llm_analysis.config.GPUConfig(name, mem_per_GPU_in_GB, hbm_bandwidth_in_GB_per_sec, intra_node_bandwidth_in_GB_per_sec, intra_node_min_message_latency, peak_fp16_TFLOPS, peak_i8_TFLOPS=None, peak_i4_TFLOPS=None, inter_node_bandwidth_in_GB_per_sec=200)[source]

Bases: object

class llm_analysis.config.ModelConfig(name, num_layers, n_head, hidden_dim, vocab_size, max_seq_len=None, num_key_value_heads=None, num_key_value_groups=None, ffn_embed_dim=None, expansion_ratio=None, model_type=None, moe_num_experts=1, moe_top_k=1, mlp_gated_linear_units=False)[source]

Bases: object

class llm_analysis.config.ParallelismConfig(tp_size=1, pp_size=1, dp_size=1, ep_size=1, sp_size=None)[source]

Bases: object
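The four config classes above are plain dataclasses. As an illustration, a parallelism setup might be described as below, using a local stand-in with the documented field names rather than importing the library; computing the total device count as tp_size * pp_size * dp_size is an assumption of this sketch, not documented behavior:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParallelismConfig:
    # Field names and defaults mirror llm_analysis.config.ParallelismConfig.
    tp_size: int = 1            # tensor parallelism degree
    pp_size: int = 1            # pipeline parallelism degree
    dp_size: int = 1            # data parallelism degree
    ep_size: int = 1            # expert parallelism degree
    sp_size: Optional[int] = None  # sequence parallelism degree


parallel = ParallelismConfig(tp_size=8, pp_size=2, dp_size=4)
# Hypothetical total GPU count under this sketch's assumption:
total_gpus = parallel.tp_size * parallel.pp_size * parallel.dp_size
print(total_gpus)  # 64
```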

llm_analysis.config.dump_configs(configs, config_dir_name)[source]

Dump configs to json files under config_dir_name.

Parameters:
  • configs (dict) – a dict of configs

  • config_dir_name (str) – the name of the output directory

Return type:

None
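A self-contained sketch of what "dump configs to json files" amounts to: one JSON file per entry in the dict, named after its key. The stand-in dataclass and function here are illustrative, not the library's code:

```python
import dataclasses
import json
import os
import tempfile


@dataclasses.dataclass
class GPUConfigLike:
    # Mirrors two of GPUConfig's documented fields, for illustration only.
    name: str
    mem_per_GPU_in_GB: float


def dump_configs(configs: dict, config_dir_name: str) -> None:
    """Write each config in the dict to <config_dir_name>/<key>.json."""
    os.makedirs(config_dir_name, exist_ok=True)
    for key, config in configs.items():
        path = os.path.join(config_dir_name, f"{key}.json")
        with open(path, "w") as f:
            json.dump(dataclasses.asdict(config), f, indent=4)


out_dir = os.path.join(tempfile.mkdtemp(), "gpu_configs")
dump_configs({"a100-sxm-80gb": GPUConfigLike("a100-sxm-80gb", 80)}, out_dir)
print(sorted(os.listdir(out_dir)))  # ['a100-sxm-80gb.json']
```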

llm_analysis.config.dump_hf_model_configs_by_type_and_task(model_type='opt', task=None, min_downloads=10000, top_k=6, config_dir_name='model_configs')[source]

Dump model configs from HuggingFace, filtered by type and task, to config_dir_name.

Parameters:
  • model_type (str, optional) – model type, e.g., gpt, llama, opt, bloom. Defaults to “opt”.

  • task (str, optional) – model task, e.g., text-generation, fill-mask. Defaults to “text-generation”.

  • min_downloads (int, optional) – minimum number of downloads a model must have to be included. Defaults to 10000.

  • top_k (int, optional) – number of most-downloaded models to keep. Defaults to 6.

  • config_dir_name (str, optional) – the name of the output directory. Defaults to MODEL_CONFIG_DIR_NAME.

Return type:

None

llm_analysis.config.dump_model_config_by_name(name, config_dir_name='model_configs')[source]

Dump a model config by name, from either the populated model_configs or HuggingFace, to config_dir_name.

Parameters:
  • name (str) – model name, e.g., gpt2, facebook/opt-1.3b, decapoda-research/llama-7b-hf, etc.

  • config_dir_name (str, optional) – the name of the output directory. Defaults to MODEL_CONFIG_DIR_NAME.

Return type:

None

llm_analysis.config.get_dtype_config_by_name(name)[source]

Get data type config from the populated mapping by name.

Return type:

DtypeConfig

llm_analysis.config.get_gpu_config_by_name(name)[source]

Get GPU config from the populated mapping by name.

Return type:

GPUConfig

llm_analysis.config.get_hf_models_by_type_and_task(model_type='opt', task=None, min_downloads=10000, top_k=6, full_info=False)[source]

Get a list of HuggingFace model names by model type and task, filtered by popularity (minimum number of downloads).

Parameters:
  • model_type (str, optional) – model type, e.g., gpt, llama, opt, bloom. Defaults to “opt”.

  • task (str, optional) – model task, e.g., text-generation, fill-mask. Defaults to “text-generation”.

  • min_downloads (int, optional) – minimum number of downloads a model must have to be included. Defaults to 10000.

  • top_k (int, optional) – number of most-downloaded models to keep. Defaults to 6.

  • full_info (bool, optional) – whether to return full model information, if False, just return the list of model names. Defaults to False.

Returns:

a list of HuggingFace model information

Return type:

list
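The filtering semantics — keep models with at least min_downloads downloads, then return the top_k most downloaded — can be sketched without network access. The function and sample data below are illustrative assumptions, not the library's code or real download counts:

```python
def filter_models(models, min_downloads=10000, top_k=6, full_info=False):
    """Filter a model list by popularity, mimicking the documented semantics.

    models: list of dicts with "modelId" and "downloads" keys (shape assumed
    here, loosely modeled on what the HuggingFace hub API returns).
    """
    popular = [m for m in models if m["downloads"] >= min_downloads]
    popular.sort(key=lambda m: m["downloads"], reverse=True)
    top = popular[:top_k]
    return top if full_info else [m["modelId"] for m in top]


# Made-up sample data for illustration only.
sample = [
    {"modelId": "facebook/opt-1.3b", "downloads": 250_000},
    {"modelId": "facebook/opt-125m", "downloads": 900_000},
    {"modelId": "someone/opt-tiny", "downloads": 42},
]
print(filter_models(sample, top_k=2))
# ['facebook/opt-125m', 'facebook/opt-1.3b']
```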

llm_analysis.config.get_model_config_by_name(name_or_path)[source]

Get a model config by name from the populated mapping, or from a model config json file path; if neither succeeds, try to get it from HuggingFace.

Return type:

ModelConfig
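The three-step lookup order — populated mapping, then a JSON file path, then HuggingFace — can be sketched as follows. The function, its parameters, and the fallback hook are hypothetical stand-ins for illustration, not the library's signature:

```python
import json
import os


def resolve_model_config(name_or_path, populated_configs, hf_fallback=None):
    """Resolve a config: mapping first, then json file path, then fallback."""
    # 1. Look up the populated mapping by name.
    if name_or_path in populated_configs:
        return populated_configs[name_or_path]
    # 2. Treat the argument as a path to a model config json file.
    if os.path.isfile(name_or_path):
        with open(name_or_path) as f:
            return json.load(f)
    # 3. Fall back to an external source (e.g., HuggingFace) if provided.
    if hf_fallback is not None:
        return hf_fallback(name_or_path)
    raise ValueError(f"cannot resolve model config for {name_or_path!r}")


configs = {"gpt2": {"name": "gpt2", "num_layers": 12}}
print(resolve_model_config("gpt2", configs)["num_layers"])  # 12
```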

llm_analysis.config.get_model_config_from_hf(name)[source]

Get a model config via the HuggingFace transformers library's AutoConfig; if the model is not found, try updating the transformers library.

Parameters:

name (str) – the model id of a pretrained model configuration hosted inside a model repo on huggingface.co

Returns:

a dataclass for llm-analysis model config

Return type:

ModelConfig

llm_analysis.config.list_dtype_configs()[source]

List all predefined data type configs.

Return type:

None

llm_analysis.config.list_gpu_configs()[source]

List all predefined gpu configs.

Return type:

None

llm_analysis.config.list_model_configs()[source]

List all predefined model configs.

Return type:

None

llm_analysis.config.populate_model_and_gpu_configs()[source]

Populate model, gpu, and data type configs from the pre-defined json files.

Return type:

None

llm_analysis.config.read_configs(config_dir_name, type='model')[source]

Read configs from a directory.

Return type:

dict
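read_configs is the inverse of dump_configs: it collects the JSON files under a directory into a dict. A self-contained sketch keyed by file stem (an assumption of this illustration, not the library's exact keying):

```python
import json
import os
import tempfile


def read_configs(config_dir_name: str) -> dict:
    """Read every .json file under config_dir_name into a dict keyed by file stem."""
    configs = {}
    for filename in os.listdir(config_dir_name):
        if filename.endswith(".json"):
            with open(os.path.join(config_dir_name, filename)) as f:
                configs[filename[: -len(".json")]] = json.load(f)
    return configs


# Round-trip demo under a temporary directory.
d = tempfile.mkdtemp()
with open(os.path.join(d, "gpt2.json"), "w") as f:
    json.dump({"name": "gpt2", "num_layers": 12}, f)
print(read_configs(d)["gpt2"]["name"])  # gpt2
```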