Awesome List on AI for Security

Original link: https://github.com/AmanPriyanshu/Awesome-AI-For-Security

This curated list highlights the growing role of AI in cybersecurity, with a focus on large language models (LLMs), agents, and multi-modal systems. It includes specialized security models such as Foundation-Sec-8B and Llama-Primus, which excel at cyber threat intelligence and reasoning. Resources like Primus-FineWeb and Primus-Reasoning support training and fine-tuning AI for security tasks. Frameworks such as AutoPatchBench and SecLLMHolmes provide evaluation standards for AI systems in security, alongside benchmarks like CTI-Bench and SECURE. Research papers explore AI applications in proactive security, vulnerability detection, and industrial control system security. Tools such as DeepFool, Counterfit, and garak enable security assessment and vulnerability probing of ML systems. AI agents like HackingBuddyGPT and Agentic Radar can automate security scanning and penetration testing. The list welcomes contributions and promotes the responsible use of AI in cybersecurity.


A curated list of tools, papers, and datasets for applying AI to cybersecurity tasks. This list primarily focuses on modern AI technologies like Large Language Models (LLMs), Agents, and Multi-Modal systems and their applications in security operations.

Other collections and lists that may be of interest.

AI models specialized for security applications and scenarios.

Specialized Security Models

  • Foundation-Sec-8B - Recent 8B parameter security-specialized model outperforming Llama 3.1 8B by +3.25% on CTI-MCQA and +8.83% on CTI-RCM, rivaling 70B models with 10x fewer parameters.
  • Llama-Primus-Base - Foundation model with cybersecurity-specific pretraining on proprietary corpus.
  • Llama-Primus-Merged - Combined model through pretraining and instruction fine-tuning.
  • Llama-Primus-Reasoning - Reasoning-specialized model enhancing security certification through expert reasoning patterns.

Resources designed for training and fine-tuning AI systems on security-related tasks.

  • Primus-FineWeb - Filtered cybersecurity corpus (2.57B tokens) derived from FineWeb using classifier-based selection.
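Classifier-based selection, as used to derive Primus-FineWeb from FineWeb, means scoring each document for security relevance and keeping only those above a threshold. The sketch below illustrates the shape of such a pipeline; the keyword heuristic is a hypothetical stand-in for the trained relevance classifier the actual pipeline uses.

```python
# Hypothetical keyword scorer standing in for a trained relevance classifier.
SECURITY_TERMS = {"vulnerability", "exploit", "malware", "cve", "phishing"}

def security_score(text: str) -> float:
    """Fraction of known security terms present in the document."""
    words = set(text.lower().split())
    return len(words & SECURITY_TERMS) / len(SECURITY_TERMS)

def filter_corpus(docs, threshold=0.2):
    """Keep only documents whose relevance score clears the threshold."""
    return [d for d in docs if security_score(d) >= threshold]

docs = [
    "A new CVE describes a heap overflow exploit in the parser.",
    "The recipe calls for two cups of flour and one egg.",
]
kept = filter_corpus(docs)
```

In the real pipeline the scorer is a learned classifier and the threshold trades corpus size against precision; the filtering loop itself stays this simple.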

IFT & Capability Datasets

  • Primus-Reasoning - Cybersecurity reasoning tasks with expert-generated reasoning steps and reflection processes.
  • Primus-Instruct - Expert-curated cybersecurity scenario instructions with GPT-4o generated responses spanning diverse tasks.

This section covers frameworks and methodologies for evaluating AI systems within security contexts.

  • AutoPatchBench - Benchmark for automated repair of fuzzing-detected vulnerabilities, pioneering evaluation standards.
  • SecLLMHolmes - Automated framework for systematic LLM vulnerability detection evaluation across multiple dimensions.
  • CTI-Bench - Benchmark suite for evaluating LLMs on cyber threat intelligence tasks.
  • SECURE - Practical cybersecurity scenario dataset focusing on extraction, understanding, and reasoning capabilities.
  • NYU CTF Bench - Dockerized CTF challenges repository enabling automated LLM agent interaction across categories.
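Several of the benchmarks above (e.g. CTI-Bench's multiple-choice track) reduce to accuracy over question/choices/answer items. A minimal sketch of that scoring loop, with a scripted stand-in for the LLM and invented items rather than the benchmark's real data:

```python
def score_mcqa(items, answer_fn):
    """Accuracy of answer_fn over (question, choices, gold) items."""
    correct = sum(
        1 for q, choices, gold in items if answer_fn(q, choices) == gold
    )
    return correct / len(items)

def model_answer(question, choices):
    # Stand-in for an LLM call: always picks the first choice.
    return choices[0]

items = [
    ("Which protocol does ARP spoofing target?", ["ARP", "DNS"], "ARP"),
    ("What does CVE stand for?",
     ["Common Virus Entry", "Common Vulnerabilities and Exposures"],
     "Common Vulnerabilities and Exposures"),
]
accuracy = score_mcqa(items, model_answer)
```

Swapping `model_answer` for a real model call (and `items` for the benchmark data) gives the headline accuracy numbers these suites report.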

General Security Knowledge

  • CyberSecEval 4 - Comprehensive benchmark suite for assessing LLM cybersecurity vulnerabilities with multi-vendor evaluations.
  • SecBench - Largest comprehensive benchmark dataset distinguishing between knowledge and reasoning questions.
  • MMLU Computer Security - Standard benchmark with dedicated computer security evaluation subset for general LLMs.
  • MMLU Security Studies - General benchmark's security studies subset providing broader security knowledge assessment.

Academic and industry research on AI applications in security.

  • Foundation-Sec Technical Report - Detailed methodology for domain-adaptation of Llama-3.1 for cybersecurity applications.
  • Primus Paper - First open-source cybersecurity dataset collection addressing critical pretraining corpus shortage.

Benchmarking & Evaluations

  • SecBench Paper - Multi-dimensional benchmark dataset with unprecedented scale for LLM cybersecurity evaluation.
  • NYU CTF Bench Paper - First scalable benchmark focusing on offensive security through CTF challenges.
  • SECURE Paper - Industry-focused benchmark targeting Industrial Control System security knowledge evaluation.
  • CyberMetric Paper - RAG-based cybersecurity benchmark with human-validated questions across diverse knowledge areas.
  • SecLLMHolmes Paper - Comprehensive analysis revealing significant non-robustness in LLM vulnerability identification capabilities.
  • LLM Offensive Security Benchmarking - Analysis of evaluation methodologies for LLM-driven offensive security tools with recommendations.
  • OffsecML Playbook - Comprehensive collection of offensive and adversarial techniques with practical demonstrations.
  • MCP-Security-Checklist - Comprehensive security checklist for MCP-based AI tools by SlowMist.

Software tools that implement AI for security applications.

  • DeepFool - Simple yet accurate method for generating adversarial examples against deep neural networks.
  • Counterfit - Automation layer for comprehensive ML system security assessment across multiple attack vectors.
  • Charcuterie - Collection of code execution techniques targeting ML libraries for security evaluation.
  • garak - Specialized security probing tool designed specifically for LLM vulnerability assessment.
  • Snaike-MLFlow - MLflow-focused red team toolsuite for attacking ML pipelines and infrastructure.
  • MCP-Scan - Security scanning tool specifically designed for Model Context Protocol servers.
  • Malware Env for OpenAI Gym - Reinforcement learning environment enabling malware manipulation for AV bypass learning.
  • Deep-pwning - Framework for assessing ML model robustness against adversarial attacks through systematic evaluation.
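To give a flavor of the adversarial-example tooling above: DeepFool's core idea is to move an input the shortest distance across the decision boundary. For a binary *affine* classifier f(x) = w·x + b that minimal perturbation has a closed form, r = -f(x)·w/||w||²; the real algorithm iterates this linearization for deep networks. A toy version of the closed form (all values illustrative):

```python
def deepfool_affine(x, w, b, overshoot=1e-6):
    """Smallest perturbation of x crossing the boundary of f(x) = w.x + b."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_sq = sum(wi * wi for wi in w)
    # Scale slightly past the boundary so the predicted label actually flips.
    scale = -(1 + overshoot) * f / norm_sq
    return [xi + scale * wi for xi, wi in zip(x, w)]

w, b = [1.0, -2.0], 0.5
x = [3.0, 1.0]                    # f(x) = 3 - 2 + 0.5 = 1.5, class +
x_adv = deepfool_affine(x, w, b)  # minimally perturbed to class -
```

The perturbation is tiny relative to the input, which is exactly why such attacks are a useful robustness probe.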

AI systems designed to perform security-related tasks with varying degrees of autonomy.

  • HackingBuddyGPT - Autonomous pentesting agent with corresponding benchmark dataset for standardized evaluation.
  • Agentic Radar - Open-source CLI security scanner for agentic workflows with automated detection.
  • HackGPT - LLM-powered tool designed specifically for offensive security and ethical hacking.
  • agentic_security - LLM vulnerability scanner specializing in agentic systems and workflows.
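Agents like those above generally share one skeleton: the model proposes an action, a tool layer executes it, and the observation feeds the next turn. The sketch below shows that loop with a scripted planner standing in for the LLM and harmless stubs standing in for real security tools; none of the names reflect any particular project's API.

```python
def scripted_planner(history):
    """Stand-in for an LLM: follows a fixed recon playbook."""
    steps = ["scan", "enumerate", "report"]
    return steps[len(history)] if len(history) < len(steps) else "stop"

# Hypothetical tool layer; real agents would shell out to actual scanners.
TOOLS = {
    "scan": lambda: "open ports: 22, 80",
    "enumerate": lambda: "service on 80: nginx 1.24",
    "report": lambda: "2 findings recorded",
}

def run_agent(planner, max_turns=10):
    """Alternate planning and tool execution until the planner stops."""
    history = []
    for _ in range(max_turns):
        action = planner(history)
        if action == "stop":
            break
        history.append((action, TOOLS[action]()))
    return history

trace = run_agent(scripted_planner)
```

The `max_turns` cap and the explicit action/observation trace are the two safety-relevant design choices: they bound autonomy and leave an auditable log of everything the agent did.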

Contributions welcome! Read the contribution guidelines first.

CC0
