Source: https://blog.csdn.net/CY19980216/article/details/142441347

Preface

Today makes for an excellent opening entry, so it will run a little long. One reason is that today more or less raises the curtain on the second half-year's racing season (though I myself still haven't managed to register for a single individual race).

The event drawing the most attention was naturally the Hengshui Lake Marathon. It is a famous PB course, one flat loop around the lake with barely any climbing, so many strong runners had staked their personal bests on it; before the race some even predicted that two to four runners would break the national record.

The reality: the top domestic finisher, Feng Peiyou, ran 2:10:11, far off the 2:06 He Jie set in the first half of the year, and even the leading international runner was less than 20 seconds quicker. He Jie himself ran only 2:32 (reportedly as a private pacer for someone else), Shunzi and Li Zhixuan both withdrew, and with that the women's field had nothing left to watch. Most of the other contenders also fell short: Shanghai's Chen Long, after months of altitude training and hopes of reaching the national "Master of Sport" (健将) standard, came home a deflated 2:33, while He Jie sat at an off-season 2:32. Mou Zhenhua (half a schoolmate of mine), however, ran a stunning 2:18 and reached the Master of Sport standard in one leap.

Most of the runners I know personally had a rough day, though Xiao Yan ran 2:49 at an even 4:00/km pace (his PB from the first half of the year was 3:02, so this counts as a massive improvement). Jai finished in 2:53 and insists he wasn't racing seriously, stopping to film quite a few people; more likely he felt mid-race that his body wasn't up to a PB (currently 2:48) and pulled back early, because a serious runner like him, in a race this important, would never let a reachable PB slip by.

At root, the temperature was simply a bit too high. Besides, training well matters less than recovering well. Honestly, Xiao Yan breaking 2:50 needles me: back in the 129 workouts we were roughly even, and his training looks much like mine, mostly tempo runs and hard intervals on about 200 km a month. So I keep wondering whether I should also take a shot at 2:50, but that seems far too risky for a first marathon; if I go out that aggressively, I might not even break three hours in the end.

Over at Yushan, Fifth Brother (五哥) and his wife took men's runner-up and women's champion in the 35 km group (4:06 and 4:30; for him that was clearly cruising, considering he ran an astonishing 5:36 in last year's Chaigu 55 km group, a terrifying 10 km per hour). A model couple: both also finished top ten at last year's Hong Kong 100. Qincai was 17th among the women (5:53), not especially quick, while SXY (7:48) somehow beat Junshi (9:04) by more than an hour. To be fair, over long distances women's endurance genuinely can surpass men's; in this race the women's champion of the 20 km group (2:32) was, remarkably, placed ahead of the men's champion (2:30). No exaggeration: had I entered, I could have taken that title with ease.

Separately, in today's 515 positioning relay, Jiawei ran two 5000 m legs: 17:41 for the first and 18:15 for the second. Stringing together two splits like that in today's weather shows his form is still excellent. He has been very busy lately; in fact it was only because he passed on the Gaobai (college 100-mile relay) captaincy that I got to take it over, otherwise leading the team would never have come to me. Which makes next weekend's Nike elite relay feel like enormous pressure: Jiawei leads off and I anchor, so I can't drag him down too badly. Truth is, I don't even know where my 5000 m stands these days; maybe 18:30, or maybe I could sneak under 18 minutes.

I don't know, just as I don't know whether, in what remains of this year, I can still fulfill the wishes I made at its start: 16 km inside one hour at the Gaobai finals, a sub-three first marathon, maybe even 2:50. Hard, but not impossible. I have waited too long already, and there is no more time left to wait.

Last dance, I pray


Table of Contents

  • Preface
  • 20240922


20240922

easyqa (Extractive + Generative + MultipleChoice × Dataset + Model):
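Read the parenthetical as a grid: three task families (Extractive, Generative, MultipleChoice), each split into a Dataset side and a Model side. Judging purely from the import paths in the code below (`src.base`, `src.datasets`, and comments pointing at `src.models.extractive` and friends), a plausible but unverified layout would be:

src/
├── base.py              # BaseClass
├── datasets/            # BaseDataset + the three task-specific dataset classes
└── models/
    ├── extractive.py
    ├── generative.py
    └── multiple_choice.py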

Dataset

# -*- coding: utf-8 -*-
# @author : caoyang
# @email: caoyang@stu.sufe.edu.cn

import os
import torch
import logging

from src.base import BaseClass


class BaseDataset(BaseClass):
    dataset_name = None
    checked_data_dirs = []
    batch_data_keys = []

    def __init__(self, data_dir, **kwargs):
        super(BaseDataset, self).__init__(**kwargs)
        self.data_dir = data_dir
        self.check_data_dir()

    @classmethod
    def generate_model_inputs(cls, batch, tokenizer, **kwargs):
        raise NotImplementedError()

    # Generator to yield batch data
    def yield_batch(self, **kwargs):
        raise NotImplementedError()

    # Check files and directories of datasets
    def check_data_dir(self):
        logging.info(f"Check data directory: {self.data_dir}")
        if self.checked_data_dirs:
            for checked_data_dir in self.checked_data_dirs:
                if os.path.exists(os.path.join(self.data_dir, checked_data_dir)):
                    logging.info(f"√ {checked_data_dir}")
                else:
                    logging.warning(f"× {checked_data_dir}")
        else:
            logging.warning("- Nothing to check!")

    # Check data keys in yield batch
    # @param batch: @yield of function `yield_batch`
    def check_batch_data_keys(self, batch):
        for key in self.batch_data_keys:
            assert key in batch[0], f"{key} not found in yield batch"


class ExtractiveDataset(BaseDataset):
    dataset_name = "Extractive"
    batch_data_keys = [
        "context",        # List[Tuple[Str, List[Str]]], i.e. List of [title, article[sentence]]
        "question",       # Str
        "answers",        # List[Str]
        "answer_starts",  # List[Int]
        "answer_ends",    # List[Int]
    ]

    def __init__(self, data_dir, **kwargs):
        super(ExtractiveDataset, self).__init__(data_dir, **kwargs)

    # Generate inputs for different models
    # @param batch: @yield of function `yield_batch`
    # @param tokenizer: Tokenizer object
    # @param model_name: See `model_name` of CLASS defined in `src.models.extractive`
    @classmethod
    def generate_model_inputs(cls, batch, tokenizer, model_name, **kwargs):
        if model_name == "deepset/roberta-base-squad2":
            # Unpack keyword arguments
            max_length = kwargs.get("max_length", 512)
            # Generate batch inputs
            contexts = list()
            questions = list()
            for data in batch:
                context = str()
                for title, sentences in data["context"]:
                    # context += title + '\n'
                    context += '\n'.join(sentences) + '\n'
                contexts.append(context)
                questions.append(data["question"])
            # Note: it must be question-first here, which is determined by `tokenizer.padding_side` ("right" or "left", default "right")
            # See `QuestionAnsweringPipeline.preprocess` in ./site-packages/transformers/pipelines/question_answering.py for details
            model_inputs = tokenizer(
                questions,
                contexts,
                add_special_tokens = True,
                max_length = max_length,
                padding = "max_length",
                truncation = True,
                return_overflowing_tokens = False,
                return_tensors = "pt",
            )   # Dict[input_ids: Tensor(batch_size, max_length),
                #      attention_mask: Tensor(batch_size, max_length)]
        else:
            raise NotImplementedError(model_name)
        return model_inputs


class GenerativeDataset(BaseDataset):
    dataset_name = "Generative"
    batch_data_keys = [
        "context",   # List[Tuple[Str, List[Str]]], i.e. List of [title, article[sentence]]
        "question",  # Str
        "answers",   # List[Str]
    ]

    def __init__(self, data_dir, **kwargs):
        super(GenerativeDataset, self).__init__(data_dir, **kwargs)

    # Generate inputs for different models
    # @param batch: @yield of function `yield_batch`
    # @param tokenizer: Tokenizer object
    # @param model_name: See `model_name` of CLASS defined in `src.models.generative`
    @classmethod
    def generate_model_inputs(cls, batch, tokenizer, model_name, **kwargs):
        # TODO: generative model inputs are not implemented yet
        model_inputs = None
        return model_inputs


class MultipleChoiceDataset(BaseDataset):
    dataset_name = "Multiple-choice"
    batch_data_keys = [
        "article",   # Str, usually
        "question",  # Str
        "options",   # List[Str]
        "answer",    # Int
    ]

    def __init__(self, data_dir, **kwargs):
        super(MultipleChoiceDataset, self).__init__(data_dir, **kwargs)

    # Generate inputs for different models
    # @param batch: @yield of function `yield_batch`
    # @param tokenizer: Tokenizer object
    # @param model_name: See `model_name` of CLASS defined in `src.models.multiple_choice`
    @classmethod
    def generate_model_inputs(cls, batch, tokenizer, model_name, **kwargs):
        if model_name == "LIAMF-USP/roberta-large-finetuned-race":
            # Unpack keyword arguments
            max_length = kwargs.get("max_length", 512)
            # Generate batch inputs
            batch_inputs = list()
            for data in batch:
                # Unpack data
                article = data["article"]
                question = data["question"]
                option = data["options"]
                flag = question.find('_') == -1  # cloze-style questions contain a '_' placeholder
                choice_inputs = list()
                for choice in option:
                    question_choice = question + ' ' + choice if flag else question.replace('_', choice)
                    inputs = tokenizer(
                        article,
                        question_choice,
                        add_special_tokens = True,
                        max_length = max_length,
                        padding = "max_length",
                        truncation = True,
                        return_overflowing_tokens = False,
                        return_tensors = None,  # return list instead of pytorch tensor, for concatenation
                    )   # Dict[input_ids: List(max_length, ),
                        #      attention_mask: List(max_length, )]
                    choice_inputs.append(inputs)
                batch_inputs.append(choice_inputs)
            # InputIds and AttentionMask
            input_ids = torch.LongTensor([[inputs["input_ids"] for inputs in choice_inputs] for choice_inputs in batch_inputs])
            attention_mask = torch.LongTensor([[inputs["attention_mask"] for inputs in choice_inputs] for choice_inputs in batch_inputs])
            model_inputs = {
                "input_ids": input_ids,            # (batch_size, n_option, max_length)
                "attention_mask": attention_mask,  # (batch_size, n_option, max_length)
            }
        elif model_name == "potsawee/longformer-large-4096-answering-race":
            # Unpack keyword arguments
            max_length = kwargs["max_length"]
            # Generate batch inputs
            batch_inputs = list()
            for data in batch:
                # Unpack data
                article = data["article"]
                question = data["question"]
                option = data["options"]
                # Bug fix: interpolate the article text (the original had the literal word "article" inside the f-string)
                article_question = [f"{question} {tokenizer.bos_token} {article}"] * 4
                # Tokenization
                inputs = tokenizer(
                    article_question,
                    option,
                    max_length = max_length,
                    padding = "max_length",
                    truncation = True,
                    return_tensors = "pt",
                )   # Dict[input_ids: Tensor(n_option, max_length),
                    #      attention_mask: Tensor(n_option, max_length)]
                batch_inputs.append(inputs)
            # InputIds and AttentionMask
            input_ids = torch.cat([inputs["input_ids"].unsqueeze(0) for inputs in batch_inputs], dim = 0)
            attention_mask = torch.cat([inputs["attention_mask"].unsqueeze(0) for inputs in batch_inputs], dim = 0)
            model_inputs = {
                "input_ids": input_ids,            # (batch_size, n_option, max_length)
                "attention_mask": attention_mask,  # (batch_size, n_option, max_length)
            }
        else:
            raise NotImplementedError(model_name)
        return model_inputs
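Before moving on to the Model layer, a minimal usage sketch may help show how this classmethod is meant to be driven. This is my own illustration, not project code: the toy batch is invented (a real one would come from the `yield_batch` generator of a concrete dataset subclass), and I assume the `LIAMF-USP/roberta-large-finetuned-race` checkpoint named in the branch above.

# Illustrative only: toy batch is an assumption; a real batch would come
# from the `yield_batch` generator of a concrete dataset subclass.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LIAMF-USP/roberta-large-finetuned-race")
batch = [{
    "article": "The quick brown fox jumps over the lazy dog.",
    "question": "What does the fox jump over?",
    "options": ["A cat", "A lazy dog", "A fence", "A river"],
    "answer": 1,
}]
model_inputs = MultipleChoiceDataset.generate_model_inputs(
    batch = batch,
    tokenizer = tokenizer,
    model_name = "LIAMF-USP/roberta-large-finetuned-race",
    max_length = 512,
)
print(model_inputs["input_ids"].shape)  # torch.Size([1, 4, 512])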

Model

# -*- coding: utf-8 -*-
# @author : caoyang
# @email: caoyang@stu.sufe.edu.cn

import torch
import string
import logging

from src.base import BaseClass
from src.datasets import (ExtractiveDataset,
                          GenerativeDataset,
                          MultipleChoiceDataset,
                          RaceDataset,
                          DreamDataset,
                          SquadDataset,
                          HotpotqaDataset,
                          MusiqueDataset,
                          TriviaqaDataset,
                          )
from transformers import AutoTokenizer, AutoModel, pipeline  # `pipeline` added: it is used by `easy_pipeline` below


class BaseModel(BaseClass):
    Tokenizer = AutoTokenizer
    Model = AutoModel

    def __init__(self, model_path, device, **kwargs):
        super(BaseModel, self).__init__(**kwargs)
        self.model_path = model_path
        self.device = device
        # Load model and tokenizer
        self.load_tokenizer()
        self.load_vocab()
        self.load_model()

    # Load tokenizer
    def load_tokenizer(self):
        self.tokenizer = self.Tokenizer.from_pretrained(self.model_path)

    # Load pretrained model
    def load_model(self):
        self.model = self.Model.from_pretrained(self.model_path).to(self.device)

    # Load vocabulary (in format of Dict[id: token])
    def load_vocab(self):
        self.vocab = {token_id: token for token, token_id in self.tokenizer.get_vocab().items()}


class ExtractiveModel(BaseModel):

    def __init__(self, model_path, device, **kwargs):
        super(ExtractiveModel, self).__init__(model_path, device, **kwargs)

    # @param batch: @yield in function `yield_batch` of Dataset object
    # @return batch_start_logits: FloatTensor(batch_size, max_length)
    # @return batch_end_logits: FloatTensor(batch_size, max_length)
    # @return batch_predicts: List[List[Tuple[Int, Str]]] with length batch_size, the predicted (position, token) pairs
    # @return batch_input_tokens: List[List[Str]] with length batch_size
    def forward(self, batch, **kwargs):
        model_inputs = self.generate_model_inputs(batch, **kwargs)
        for key in model_inputs:
            model_inputs[key] = model_inputs[key].to(self.device)
        model_outputs = self.model(**model_inputs)
        # 2024/09/13 11:08:21
        # Note: Skip the first token <s> or [CLS] in most situations
        batch_start_logits = model_outputs.start_logits[:, 1:]
        batch_end_logits = model_outputs.end_logits[:, 1:]
        batch_input_ids = model_inputs["input_ids"][:, 1:]
        del model_inputs, model_outputs
        batch_size = batch_start_logits.size(0)
        batch_predicts = list()
        batch_input_tokens = list()
        for i in range(batch_size):
            start_index = batch_start_logits[i].argmax().item()
            end_index = batch_end_logits[i].argmax().item()
            input_ids = batch_input_ids[i]
            input_tokens = list(map(lambda _token_id: self.vocab[_token_id.item()], input_ids))
            predict_tokens = list()
            for index in range(start_index, end_index + 1):
                predict_tokens.append((index, self.vocab[input_ids[index].item()]))
                # predict_tokens.append(self.vocab[input_ids[index].item()])
            batch_predicts.append(predict_tokens)
            batch_input_tokens.append(input_tokens)
        return batch_start_logits, batch_end_logits, batch_predicts, batch_input_tokens

    # Generate model inputs
    # @param batch: @yield in function `yield_batch` of Dataset object
    def generate_model_inputs(self, batch, **kwargs):
        return ExtractiveDataset.generate_model_inputs(
            batch = batch,
            tokenizer = self.tokenizer,
            model_name = self.model_name,  # `model_name` is expected to be set on concrete subclasses
            **kwargs,
        )

    # Use question-answering pipeline provided by transformers
    # See `QuestionAnsweringPipeline.preprocess` in ./site-packages/transformers/pipelines/question_answering.py for details
    # @param context: Str / List[Str] (batch)
    # @param question: Str / List[Str] (batch)
    # @return pipeline_outputs: Dict[score: Float, start: Int, end: Int, answer: Str]
    def easy_pipeline(self, context, question):
        # context = """Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy"."""
        # question = """When did Beyonce start becoming popular?"""
        pipeline_inputs = {"context": context, "question": question}
        question_answering_pipeline = pipeline(
            task = "question-answering",
            model = self.model,
            tokenizer = self.tokenizer,  # Bug fix: was a bare `tokenizer`, which is undefined in this scope
        )
        pipeline_outputs = question_answering_pipeline(pipeline_inputs)
        return pipeline_outputs


class GenerativeModel(BaseModel):

    def __init__(self, model_path, device, **kwargs):
        super(GenerativeModel, self).__init__(model_path, device, **kwargs)

    # @param batch: @yield in function `yield_batch` of Dataset object
    def forward(self, batch, **kwargs):
        model_inputs = self.generate_model_inputs(batch, **kwargs)
        model_outputs = self.model(**model_inputs)
        # TODO: decoding of generative outputs is not implemented yet
        NotImplemented

    # Generate model inputs
    # @param batch: @yield in function `yield_batch` of Dataset object
    def generate_model_inputs(self, batch, **kwargs):
        return GenerativeDataset.generate_model_inputs(
            batch = batch,
            tokenizer = self.tokenizer,
            model_name = self.model_name,
            **kwargs,
        )


class MultipleChoiceModel(BaseModel):

    def __init__(self, model_path, device, **kwargs):
        super(MultipleChoiceModel, self).__init__(model_path, device, **kwargs)

    # @param batch: @yield in function `yield_batch` of Dataset object
    # @return batch_logits: FloatTensor(batch_size, n_option)
    # @return batch_predicts: List[Int] (batch_size, ), the predicted option indices
    def forward(self, batch, **kwargs):
        model_inputs = self.generate_model_inputs(batch, **kwargs)
        for key in model_inputs:
            model_inputs[key] = model_inputs[key].to(self.device)
        model_outputs = self.model(**model_inputs)
        batch_logits = model_outputs.logits
        del model_inputs, model_outputs
        batch_predicts = [torch.argmax(logits).item() for logits in batch_logits]
        return batch_logits, batch_predicts

    # Generate model inputs
    # @param batch: @yield in function `yield_batch` of Dataset object
    # @param max_length: Max length of input tokens
    def generate_model_inputs(self, batch, **kwargs):
        return MultipleChoiceDataset.generate_model_inputs(
            batch = batch,
            tokenizer = self.tokenizer,
            model_name = self.model_name,
            **kwargs,
        )
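To close the loop, here is a hedged end-to-end sketch. The original file defines only the base classes, so the subclass below, its `model_name` attribute, and the swap to `AutoModelForMultipleChoice` (needed so that `model_outputs.logits` holds per-option scores) are all my assumptions about how the project wires things up; `batch` is the toy batch from the dataset sketch above.

# Hypothetical subclass: the base classes never set `model_name`,
# so pinning it here is an assumption, not the project's own code.
import torch
from transformers import AutoModelForMultipleChoice

class RaceMultipleChoiceModel(MultipleChoiceModel):
    model_name = "LIAMF-USP/roberta-large-finetuned-race"
    Model = AutoModelForMultipleChoice  # head that returns per-option logits

model = RaceMultipleChoiceModel(
    model_path = "LIAMF-USP/roberta-large-finetuned-race",
    device = "cuda" if torch.cuda.is_available() else "cpu",
)
batch_logits, batch_predicts = model.forward(batch, max_length = 512)
print(batch_predicts)  # e.g. [1]: index of the option the model picked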