Building Your Own AI Instance
Brian Pichman
Description
• Join Brian Pichman from the Evolve Project in an enlightening session focused on building your own AI chatbot. This advanced track delves into the practical aspects of utilizing the OpenAI API alongside other innovative software products. Participants will gain invaluable insights into the processes and technologies involved in building a custom AI instance. This track is ideal for those seeking a deeper understanding of AI integration and personalization in the realm of conversational AI.
Key Tools
• We will be spending most of our time exploring the OpenAI APIs – https://platform.openai.com/
• We will also experiment with two of my favorite tools:
• https://www.chatbase.co/
• https://afforai.com/
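To ground the API portion of the session, here is a minimal sketch of a chat request using OpenAI's official `openai` Python package. The model name and messages are placeholders you would swap for your own, and the sketch assumes an `OPENAI_API_KEY` environment variable is set.

```python
# pip install openai
from openai import OpenAI

# The client reads the OPENAI_API_KEY environment variable by default.
client = OpenAI()

# A one-shot chat completion: the system message sets behavior,
# the user message carries the actual question.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use any chat model you have access to
    messages=[
        {"role": "system", "content": "You are a helpful library assistant."},
        {"role": "user", "content": "What does it mean to build my own AI instance?"},
    ],
)

print(response.choices[0].message.content)
```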
But First … Rules Of The Road
Things To Think About
• Privacy:
• If you're pulling from a data model: what data does it access, what gets stored, and how is it managed?
• Does your policy cover what gets fed into the models?
• Wisdom Tips
• Never upload anything that contains PII or is confidential.
• Ask for permission to upload publicly available information (just as you would cite a source or ask for permission when writing a paper).
• Usage of Bot
• How do you correct misinformation given by your bot? What if someone uploads something they shouldn't? What policies are in place for data retention?
Privacy concerns
OpenAI: "We also use data from versions of ChatGPT and DALL·E for individuals. Data from ChatGPT Team, ChatGPT Enterprise, and the API Platform (after March 1, 2023) isn't used for training our models."
https://openai.com/enterprise-privacy

Anthropic: "We will not train our models on any Materials that are not publicly available, except in two circumstances: if you provide Feedback to us and if your Materials are flagged for trust and safety review."
https://support.anthropic.com/en/articles/7996885-how-do-you-use-personal-data-in-model-training

Google: "Gemini Apps use your past conversations, location, and related info to generate a response. Google uses conversations (as well as feedback and related data) from Gemini Apps users to improve Google products (such as the generative machine-learning models that power Gemini Apps). Human review is a necessary step of the model improvement process. Through their review, rating, and rewrites, humans help enable quality improvements of generative machine-learning models like the ones that power Gemini Apps."
https://support.google.com/gemini/answer/13594961?hl=en#zippy=%2Cwhat-data-is-collected-how-is-it-used
Risk management plans
Amazon, Anthropic, Google, Meta, Microsoft, and OpenAI agreed to test systems before release, collaborate with government and
academia, invest in cybersecurity, build watermarking systems, publicly disclose capabilities, and research bias and privacy issues.
OpenAI:
• Safety Team for existing models
• Preparedness Framework for frontier models
• Assess and evaluate capabilities in persuasion, cybersecurity, CBRN threats, and autonomous replication
• Superalignment for AGI/ASI
• Use AI to help align AGI

Anthropic: The Responsible Scaling Policy outlines safety levels (ASL 1-5) and details plans to detect capabilities that have advanced to the next level and to decide whether and how the model should be deployed.

Google mostly talks about cybersecurity and their research.

Microsoft has a template for individual teams to design their own plans.

Amazon has a set of tools that allow model builders to specify topics to be avoided and to understand how a dataset might lead to biased or unexpected outputs.
Trained Data with Personalities
1. Temperature:
• Controls the randomness of the model's responses.
• Lower temperature (e.g., 0.0) means more predictable and conservative responses.
• Higher temperature (e.g., 1.0) makes responses more creative and less predictable.
2. Frequency Penalty:
• Reduces the model's likelihood of repeating the same line of thought or content.
• A higher frequency penalty discourages repetition.
• A value of 0 means no penalty for repetition.
3. Presence Penalty:
• Affects the likelihood of introducing new concepts or topics in the response.
• A higher presence penalty encourages the model to talk about new things.
• A value of 0 means no emphasis on new content.

Providing Context: You can include specific instructions or context in your prompt. For example, "Act like a friendly and knowledgeable assistant" sets a tone and character for the model's responses.
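As a concrete illustration, here is a sketch of how these three parameters, plus a context-setting system message, map onto an OpenAI chat completion call. The values and model name are illustrative examples, not recommendations.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",   # placeholder model name
    messages=[
        # "Providing Context": the system message sets the bot's personality.
        {"role": "system", "content": "Act like a friendly and knowledgeable assistant."},
        {"role": "user", "content": "Suggest three beginner-friendly books about AI."},
    ],
    temperature=0.7,        # 0.0 = predictable and conservative, 1.0 = more creative
    frequency_penalty=0.5,  # > 0 discourages repeating the same phrasing
    presence_penalty=0.5,   # > 0 nudges the model toward new topics
)

print(response.choices[0].message.content)
```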
Training Data
Afforai
https://huggingface.co/
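If you source training or grounding data from the Hugging Face Hub, a minimal sketch with the `datasets` library looks like the following; the dataset name is only an example, and you would substitute whichever dataset fits your bot.

```python
# pip install datasets
from datasets import load_dataset

# Pull a public dataset from the Hugging Face Hub
# ("ag_news" is just an example dataset name).
dataset = load_dataset("ag_news", split="train")

print(dataset[0])    # inspect one record
print(len(dataset))  # how many examples you have to work with
```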
Questions
• bpichman@evolveproject.org


Editor's Notes

  • #11 Great at web scraping
  • #12 Upload data or even survey results to analyze