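diff --git a/README_cn.md b/README_cn.md
index d7ccc84b..45463d40 100644
--- a/README_cn.md
+++ b/README_cn.md
@@ -469,3 +469,699 @@ x = x + 1 # this changes the value assigned to the variable x
 A more common definition is that the operating system is the one program running at all times on the computer (usually called the kernel), with everything else being application programs.
 An operating system can be viewed from two perspectives: as a resource manager and as an extended machine. In the resource-manager view, the operating system's job is to manage the different parts of the system efficiently. In the extended-machine view, its job is to provide users with abstractions that are more convenient to use than the actual machine. These abstractions include processes, address spaces, and files. Operating systems have a long history, from the days when they replaced the operator to modern multiprogramming systems. Important milestones include early batch systems, multiprogramming systems, and personal computer systems. Because operating systems interact closely with the hardware, some knowledge of computer hardware is useful for understanding them. Computers are built from processors, memory, and I/O devices, which are connected by buses. All operating systems are built on the basic concepts of processes, memory management, I/O management, the file system, and security. The heart of any operating system is the set of system calls it can handle; these tell us what the operating system does.
+
+### The Operating System as a Resource Manager
+The operating system manages all the pieces of a complex system. Modern computers consist of processors, memory, timers, disks, mice, network interfaces, printers, and a wide variety of other devices.
+From the bottom-up view, the job of the operating system is to provide an orderly and controlled allocation of the processors, memory, and I/O devices among the various programs competing for them.
+Modern operating systems allow multiple programs to be in memory and run at the same time. Imagine what would happen if three programs running on one computer all tried to print their output simultaneously on the same printer: the result would be utter chaos. The operating system can bring order to the potential chaos by buffering all output destined for the printer on the disk.
+When one program finishes, the operating system can copy its output from the disk file where it was stored to the printer, while the other programs continue generating more output, unaware that their output is not (yet) going to the printer.
+When a computer (or network) has more than one user, the need to manage and protect the memory, I/O devices, and other resources is even greater, since the users might otherwise interfere with one another. In addition, users often need to share not only hardware but also information (files, databases, etc.). In short, this view of the operating system holds that its primary task is to keep track of which programs are using which resources, to grant resource requests, to account for usage, and to mediate conflicting requests from different programs and users.
+
+### The Operating System as an Extended Machine
+At the machine-language level, the architecture of most computers is primitive and awkward to program, especially for input/output. To make this point more concrete, consider the modern SATA (Serial ATA) hard disks used on most computers, and what a programmer would have to know to use such a disk directly.
+The interface has been revised several times since 2007 and is more complicated now than it was then. No sane programmer would want to deal with this disk at the hardware level.
+Instead, a piece of software called a disk driver deals with the hardware and provides an interface to read and write disk blocks without getting into the details.
+Operating systems contain many drivers for controlling I/O devices.
+But even this level is much too low for most applications. For this reason, all operating systems provide yet another layer of abstraction for using disks: files.
+Using this abstraction, programs can create, write, and read files without having to deal with the messy details of how the hardware actually works.
+This abstraction is the key to managing all this complexity. A good abstraction turns a nearly impossible task into two manageable ones. The first is defining and implementing the abstractions. The second is using these abstractions to solve the problem at hand.
+
+### History of Operating Systems
+- **First generation (1945-55)**: After Babbage's unsuccessful efforts, little progress was made in constructing digital computers until the World War II period. At Iowa State University, Professor John Atanasoff and his graduate student Clifford Berry built what is now regarded as the first functioning digital computer. At roughly the same time, Konrad Zuse in Berlin built the Z3 computer out of electromechanical relays. Howard Aiken built the Mark I at Harvard, a group of scientists at Bletchley Park in England built the Colossus, and John W. Mauchly and his graduate student J. Presper Eckert built the ENIAC at the University of Pennsylvania, completing it in 1945.
+
+- 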
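**Second generation (1955-65)**: The introduction of the transistor in the mid-1950s changed the picture radically. Computers became reliable enough that they could be manufactured and sold to paying customers with the expectation that they would keep working long enough to get some useful work done. These machines, now called mainframes, were locked away in large, specially air-conditioned computer rooms, with staffs of professional operators to run them. Only large corporations, major government agencies, or universities could afford the multimillion-dollar price tag.
+
+- **Third generation (1965-80)**: In contrast to second-generation computers, which were built from individual transistors, the IBM 360 was the first major computer line to use (small-scale) integrated circuits (ICs), and it therefore offered a major price/performance advantage. It was an immediate success, and the idea of a family of compatible computers was soon adopted by all the other major manufacturers. All software, including the OS/360 operating system, was originally intended to work on all models. It had to run on very large systems, which often replaced 7094s for weather forecasting and other heavy computing, and on small systems, which often just replaced 1401s for copying cards to tape. It had to work well on systems with few peripherals and on systems with many. It had to work in commercial environments and in scientific environments. Above all, it had to be efficient for all of these different uses.
+
+- **Fourth generation (1980-present)**: With the development of LSI (Large Scale Integration) circuits, chips containing thousands of transistors on a square centimeter of silicon, the age of the personal computer dawned. In terms of architecture, personal computers (initially called microcomputers) were not all that different from minicomputers of the PDP-11 class, but in terms of price they certainly were.
+
+- **Fifth generation (1990-present)**: Ever since detective Dick Tracy started talking into his "two-way radio wrist watch" in the 1940s comic strip, people have wanted a communication device they could carry around with them. The first real mobile phone appeared in 1946 and weighed about 40 kilograms. The first true handheld phone appeared in the 1970s and, at roughly one kilogram, was remarkably light by comparison. It was nicknamed "the brick" and quickly became popular around the world.
+
+### Functions of the Operating System
+- **Convenience**: An operating system makes a computer more convenient to use.
+- **Efficiency**: An operating system allows the resources of the computer system to be used efficiently.
+- **Ability to evolve**: An operating system should be built so that new system functions can be developed, tested, and introduced effectively without interfering with service.
+- **Throughput**: An operating system should be built to deliver the maximum throughput (the number of tasks completed per unit of time).
+
+### Major Functionalities of the Operating System
+- **Resource management**: When parallel access occurs, that is, when multiple users access the system at the same time, the operating system acts as a resource manager whose responsibility is to allocate hardware to those users. This reduces the load on the system.
+- **Process management**: This includes tasks such as scheduling and terminating processes. The operating system can manage many tasks at once. CPU scheduling, which takes place here, is carried out by one of the many available scheduling algorithms.
+- **Storage management**: The file system mechanism is used for storage management. NIFS, CFS, CIFS, and NFS are examples of file systems. All data is stored on the tracks of hard disks, and all of it is managed by the storage manager, whose scope includes the hard disk.
+- **Memory management**: This refers to the management of primary memory. The operating system has to keep track of how much memory is in use and by whom. It has to decide which process needs memory space and how much. It must also allocate and deallocate memory space.
+- **Security/privacy management**: The operating system also provides privacy by means of passwords, so that unauthorized applications cannot access programs or data. For example, Windows uses **_Kerberos_** authentication to prevent unauthorized access to data.
+
+### Types of Operating Systems
+- **Mainframe operating systems**:
+At the high end are the operating systems for mainframes, the room-sized computers still found in major corporate data centers. These computers differ from personal computers in their I/O capacity. A mainframe with 1000 disks and millions of gigabytes of data is not unusual; a personal computer with these specifications would be the envy of its peers. Mainframes are also making something of a comeback as high-end Web servers, servers for large-scale electronic commerce sites, and servers for business-to-business transactions.
+Mainframe operating systems are heavily oriented toward processing many jobs at once, most of which need large amounts of I/O. They typically offer three kinds of services: batch, transaction processing, and timesharing.
+
+- **Server operating systems**:
+One level down are the server operating systems. They run on servers, which are either very large personal computers, workstations, or even mainframes. They serve multiple users at once over a network and allow the users to share hardware and software resources. Servers can provide print service, file service, or Web service.
+Internet providers run many server machines to support their customers, and websites use servers to store the Web pages and handle the incoming requests.
+Typical server operating systems are Solaris, FreeBSD, Linux, and Windows Server 201x.
+
+- **Multiprocessor operating systems**:
+An increasingly common way to get major computing power is to connect multiple CPUs into a single system.
+Depending on precisely how they are connected and what is shared, these systems are called parallel computers, multicomputers, or multiprocessors.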
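+They need special operating systems, but often these are variations on the server operating systems, with special features for communication, connectivity, and consistency.
+
+- **Personal computer operating systems**:
+The next category is the personal computer operating system. Modern ones all support multiprogramming, often with dozens of programs started up at boot time.
+Their job is to provide good support to a single user. They are widely used for word processing, spreadsheets, games, and Internet access. Common examples are Linux, FreeBSD, Windows 7, Windows 8, and Apple's OS X. Personal computer operating systems are so widely known that they probably need little introduction.
+Many people are not even aware that other kinds of operating systems exist.
+
+- **Embedded operating systems**:
+Embedded systems run on the computers that control devices not generally thought of as computers, and they do not accept user-installed software.
+Typical examples are microwave ovens, TV sets, cars, DVD recorders, traditional phones, and MP3 players. The main property that distinguishes embedded systems from handhelds is the certainty that no untrusted software will ever run on them.
+You cannot download new applications to your microwave oven; all the software is in ROM. This means there is no need for protection between applications, which simplifies the design. Systems such as Embedded Linux, QNX, and VxWorks are popular in this domain.
+
+- **Smart card operating systems**:
+The smallest operating systems run on smart cards, which are credit-card-sized devices containing a CPU chip. They have very severe processing power and memory constraints.
+Some are powered by contacts in the reader into which they are inserted, but contactless smart cards are inductively powered, which greatly limits what they can do. Some can handle only a single function, such as electronic payments, while others can handle multiple functions.
+Often these are proprietary systems.
+Some smart cards are Java oriented. This means that the ROM on the smart card holds an interpreter for the Java Virtual Machine (JVM). Java applets (small programs) are downloaded to the card and interpreted by the JVM interpreter.
+Some of these cards can handle multiple Java applets at the same time, which leads to multiprogramming and the need to schedule them. Resource management and protection also become an issue when two or more applets are present at the same time.
+These issues must be handled by the (usually extremely primitive) operating system present on the card.
+
+## [Memory and Storage](Memory%20and%20Storage/readme.md)
+
+### Memory
+Memory is the component of a computer that allows short-term data access. You can think of this component as DRAM, or dynamic random-access memory. Your computer performs many operations by accessing data held in its short-term memory, such as editing a document, loading an application, and browsing the Internet. The speed and performance of your system depend in part on the amount of memory installed.
+
+If you have a desk and a filing cabinet, the desk represents your computer's memory. Items you need right away are kept on the desk for easy access. However, the size of the desk limits how much you can keep on it.
+
+### Storage
+In contrast to memory, storage is the component of a computer that lets you store and access data over the long term. Storage usually comes in the form of a solid-state drive or a hard disk drive. It holds your applications, operating system, and files permanently. The computer needs to read and write information from the storage system, so the speed of the storage determines how quickly your system can boot, load, and access what you have saved.
+
+While the desk represents the computer's memory, the filing cabinet represents its storage. It holds items that need to be kept but do not necessarily need to be accessed immediately. The size of the filing cabinet means it can hold many things.
+
+**An important distinction between memory and storage** is that memory is cleared when the computer is turned off. Storage, on the other hand, stays intact no matter how many times you shut the computer down. In the desk and filing cabinet analogy, any files left on the desk when you leave the office are thrown away, while everything in the filing cabinet remains.
+
+### Virtual Memory
+At the heart of a computer system is memory, the space where programs run and data is kept. But what happens when the programs you are running and the data you are working with exceed the capacity of the computer's physical memory? This is where virtual memory comes in, acting as an intelligent extension of the computer's memory.
+
+**Definition and purpose of virtual memory:**
+
+Virtual memory is a memory management technique used by operating systems to overcome the limits of physical memory (RAM). It gives software applications the illusion that they can access more memory than is physically installed in the computer. In essence, it lets programs use a memory space that extends beyond the computer's physical RAM.
+
+The main purpose of virtual memory is to enable effective multitasking and the execution of larger programs while keeping the system responsive. It achieves this by creating a seamless interaction between physical RAM and secondary storage devices such as hard disk drives or SSDs.
+
+**How virtual memory extends the available physical memory:**
+
+Virtual memory can be seen as a bridge between the computer's RAM and its secondary storage (the disk drive). When you run a program, parts of it are loaded into the faster physical memory (RAM). However, not every part of the program is needed right away.
+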
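To make the idea of keeping only the actively used parts of a program in RAM more concrete, here is a minimal sketch that counts page faults for a sequence of page references. It assumes a least-recently-used (LRU) replacement policy; the text does not name a policy, and real operating systems use more elaborate approximations. The function name `count_page_faults` and the sample reference string are illustrative only.

```python
from collections import OrderedDict
from typing import Iterable


def count_page_faults(references: Iterable[int], frame_count: int) -> int:
    """Simulate RAM with `frame_count` frames and LRU eviction (an assumed policy)."""
    frames: "OrderedDict[int, bool]" = OrderedDict()  # oldest entry = least recently used
    faults = 0
    for page in references:
        if page in frames:
            # Page is already in RAM: mark it as most recently used.
            frames.move_to_end(page)
            continue
        # Page fault: the page must be brought in from secondary storage.
        faults += 1
        if len(frames) >= frame_count:
            # RAM is full: evict the least recently used page.
            frames.popitem(last=False)
        frames[page] = True
    return faults


if __name__ == "__main__":
    reference_string = [1, 2, 3, 1, 4, 2]
    print(count_page_faults(reference_string, frame_count=3))
    print(count_page_faults(reference_string, frame_count=4))
```

Running the sketch with more frames should never produce more faults for this policy, which mirrors the point that extra RAM reduces traffic to secondary storage.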
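+Virtual memory takes advantage of this by moving the parts of a program that are not accessed often from RAM to secondary storage, which frees space in RAM for the parts that are. This process is transparent to the user and to the running program. When the moved parts are needed again, they are swapped back into RAM, and other less active parts may be moved out to secondary storage.
+
+This dynamic swapping of data in and out of physical memory is managed by the operating system. It allows programs to run even when they are larger than the available RAM, because the operating system decides which data needs to be in RAM for the best performance.
+
+Overall, virtual memory acts as a virtualization layer that extends the available physical memory by temporarily transferring parts of programs and data between RAM and secondary storage. This lets the computer handle larger tasks and many programs at once while staying efficient and responsive.
+
+## [File System](File%20System/readme.md)
+In computing, a file system (often abbreviated to fs) is the method and data structure that an operating system uses to control how data is stored and retrieved. Without a file system, data placed in a storage medium would be one large body of data with no way to tell where one piece of data ends and the next begins, or where any piece of data is located when it needs to be retrieved. By separating the data into pieces and giving each piece a name, the data is easily isolated and identified. Taking its name from the way paper-based data management systems are named, each group of data is called a "file". The structure and logic rules used to manage the groups of data and their names are called a "file system".
+
+There are many kinds of file systems, each with its own structure and logic and with different properties of speed, flexibility, security, size, and so on. Some file systems are designed for specific applications. For example, the ISO 9660 file system is designed specifically for optical discs.
+
+File systems can be used on many types of storage devices using various media. As of 2019, hard disk drives have been the key storage devices and are projected to remain so for the foreseeable future. Other media in use include SSDs, magnetic tapes, and optical discs. In some cases, such as with tmpfs, the computer's main memory (random-access memory, RAM) is used to create a temporary file system for short-term use.
+
+Some file systems are used on local data storage devices; others provide file access via a network protocol (for example, NFS, SMB, or 9P clients). Some file systems are "virtual", meaning that the supplied "files" (called virtual files) are computed on request (such as procfs and sysfs) or are merely a mapping into a different file system used as a backing store. The file system manages access to both the content of files and the metadata about those files. It is responsible for arranging storage space; reliability, efficiency, and tuning with regard to the physical storage medium are important design considerations.
+
+### How File Systems Work
+A file system stores and organizes data and can be thought of as a kind of index for all the data on a storage device. These devices can include hard drives, optical drives, and flash drives.
+
+File systems specify conventions for naming files, including the maximum number of characters in a name, which characters can be used, and, in some systems, how long the file name suffix can be. In many file systems, file names are not case sensitive.
+
+Along with the file itself, file systems hold information such as the size of the file, its attributes, its location, and its place in the directory hierarchy in the metadata. Metadata can also identify free blocks of available storage on the drive and how much space is available.
+
+A file system also includes a format for specifying the path to a file through the structure of directories. A file is placed in a directory (or a folder in Windows operating systems) or in a subdirectory at the desired place in the tree structure. PC and mobile operating systems have file systems in which files are placed in a hierarchical tree structure.
+
+Before files and directories are created on the storage medium, partitions should be put in place. A partition is a region of the hard disk or other storage that the operating system manages separately. One file system is contained in the primary partition, and some operating systems allow for multiple partitions on one disk. In that situation, if one partition becomes corrupted, the data in the other partitions will still be safe.
+
+### Types of File Systems
+There are several types of file systems, all with different logical structures and properties, such as speed and size. The type of file system can be determined by the operating system and its needs. Microsoft Windows, Mac OS X, and Linux are the most common PC operating systems. Mobile operating systems include Apple iOS and Google Android.
+
+Major file systems include the following:
+
+- File Allocation Table (FAT) is supported by the Microsoft Windows operating system. FAT is considered simple and reliable and is modeled after legacy file systems. FAT was designed in 1977 for floppy disks but was later adapted for hard disks. While efficient and compatible with most current operating systems, FAT cannot match the performance and scalability of more modern file systems.
+
+- Global File System (GFS) is a file system for the Linux operating system, and it is a shared-disk file system. GFS offers direct access to shared block storage and can be used as a local file system.
+
+- GFS2 is an updated version with features not included in the original GFS, such as an updated metadata system. Under the terms of the GNU General Public License, both the GFS and GFS2 file systems are available as free software.
+
+- Hierarchical File System (HFS) was developed for use with Mac operating systems. HFS is also known as Mac OS Standard and was succeeded by Mac OS 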
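Extended. HFS was originally introduced in 1985 for floppy and hard disks, replacing the original Macintosh File System. It can also be used on CD-ROMs.
+
+- NT File System (NTFS), also known as New Technology File System, is the default file system for Windows products from the Windows NT 3.1 operating system onward. Compared with the earlier FAT file system, NTFS improves metadata support, performance, and use of disk space. NTFS is also supported in the Linux operating system through a free, open-source NTFS driver. Mac operating systems have read-only support for NTFS.
+
+- Universal Disk Format (UDF) is a vendor-neutral file system for optical media and DVDs. UDF replaces the ISO 9660 file system and is the official file system for DVD video and audio, as chosen by the DVD Forum.
+
+## [Cloud Computing](Cloud%20Computing/Readme.md)
+Cloud computing is the ability to access information and applications over the Internet. It allows users to reach applications and data from any location that has an Internet connection.
+
+Cloud computing is a type of Internet-based computing that provides shared computer processing resources and data to computers and other devices on demand.
+
+It is a model that enables ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (such as networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
+
+### Top Benefits of Cloud Computing
+
+Cloud computing is a big shift in the way businesses think about IT resources. Here are seven common reasons organizations are turning to cloud computing services:
+
+Cost
+Cloud computing eliminates the capital expense of buying hardware and software and of setting up and running on-site data centers: the racks of servers, the round-the-clock electricity for power and cooling, and the IT experts for managing the infrastructure. It adds up fast.
+
+Speed
+Most cloud computing services are provided as self-service and on demand, so even vast amounts of computing resources can be provisioned in minutes, typically with just a few mouse clicks. This gives businesses a lot of flexibility and takes the pressure off capacity planning.
+
+Global scale
+The benefits of cloud computing services include the ability to scale elastically. In cloud terms, that means delivering the right amount of IT resources (for example, more or less computing power, storage, and bandwidth) right when they are needed and from the right geographic location.
+
+Productivity
+On-site data centers typically require a lot of "racking and stacking": hardware setup, software patching, and other time-consuming IT management chores. Cloud computing removes the need for many of these tasks, so IT teams can spend their time on more important business goals.
+
+
+Performance
+The biggest cloud computing services run on a worldwide network of secure data centers, which are regularly upgraded to the latest generation of fast and efficient computing hardware. This offers several benefits over a single corporate data center, including reduced network latency for applications and greater economies of scale.
+
+Reliability
+Cloud computing makes data backup, disaster recovery, and business continuity easier and less expensive, because data can be mirrored at multiple redundant sites on the cloud provider's network.
+
+Security
+Many cloud providers offer a broad set of policies, technologies, and controls that strengthen your overall security posture and help protect your data, applications, and infrastructure from potential threats.
+
+### Types of Cloud Computing Services
+- [Infrastructure as a Service (IaaS)](Cloud%20Computing/Readme.md#infrastructure-as-a-service-iaas)
+- [Platform as a Service (PaaS)](Cloud%20Computing/Readme.md#platform-as-a-service-paas)
+- [Software as a Service (SaaS)](Cloud%20Computing/Readme.md#software-as-a-service-saas)
+
+
+## [Machine Learning]()
+Machine learning is the practice of teaching a computer to learn. The concept uses pattern recognition, as well as other forms of predictive algorithms, to make judgments about incoming data. The field is closely related to artificial intelligence and computational statistics.
+
+### Machine learning has four subcategories:
+
+### Supervised machine learning
+Here, machine learning models are trained with labeled data sets, which allows the models to learn and become more accurate over time. For example, an algorithm would be trained with pictures of dogs and other things, all labeled by humans, and the machine would learn ways to identify pictures of dogs on its own. Supervised machine learning is the most common type in use today.
+
+Practical applications of supervised learning:
+1. **Biometrics:** Biometrics deals with storing and matching biological information about individuals, such as fingerprints, iris texture, and earlobes. Today's mobile phones can learn our biometric data and then use it to verify us, which improves the security of the system.
+2. **Speech recognition:** You only need to give the program samples of your voice, and it will be able to recognize you. The best-known real-world examples are digital assistants such as Google Assistant or Siri, which can be set up to respond only to your voice.
+3. **Spam detection:** This tool is used to block fake or machine-generated messages. Gmail includes an algorithm that has learned many words associated with spam. The OnePlus messaging app asks the user to specify which words should be blocked, and messages containing those keywords are then blocked in the app.
+4. 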
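**Visual object recognition:** These algorithms are generally used to identify an object. You train the algorithm on a large training set, and it can then use what it learned from that data set to recognize new objects.
+
+### Unsupervised machine learning
+In unsupervised machine learning, a program looks for patterns in unlabeled data. Unsupervised machine learning can find patterns or trends that people are not explicitly looking for. For example, an unsupervised machine learning program could look through online sales data and identify different types of customers making purchases.
+
+Practical applications of unsupervised learning:
+1. **Clustering:** Clustering is the process of grouping data. When we do not know the details of all the clusters in advance, we can use unsupervised learning to form them. Unsupervised learning is used to analyze and organize data that has no pre-labeled classes or class attributes. Clustering can help companies handle their data more effectively.
+Suppose you have a YouTube channel. You may have a lot of information about your subscribers. If you want to find similar subscribers, you need a clustering technique.
+2. **Visualization:** The process of making charts, images, graphs, diagrams, and so on to present information is called visualization. Unsupervised machine learning can be used to support this.
+Suppose you are a cricket coach with information about your team's performance in matches. You may want to see all the match statistics at a glance. You can pass the unlabeled, complex data to a visualization algorithm.
+3. **Anomaly detection:** Anomaly detection is the discovery of unusual items, events, or observations that raise suspicion by deviating significantly from normal data. In this case, the system is trained on a large number of typical cases, so when it sees an unexpected event it can judge whether or not it is an anomaly.
+Credit card fraud detection is a good example. This problem is now being addressed with anomaly detection methods from unsupervised machine learning: to prevent fraud, the system flags unusual credit card transactions.
+
+### Semi-supervised machine learning
+The disadvantage of supervised learning is that it requires hand-labelling by ML specialists or data scientists, which is costly. Unsupervised learning, in turn, has a limited range of applications. Semi-supervised learning was introduced to overcome these drawbacks of supervised and unsupervised learning algorithms. Typically, the training data combines a very small amount of labelled data with a large amount of unlabelled data. The basic procedure is that the programmer first clusters similar data using an unsupervised learning algorithm and then uses the existing labelled data to label the rest of the unlabelled data.
+
+Practical applications of Semi-Supervised Learning –
+1. **Speech Analysis:** Since labelling audio files is a very labour-intensive task, Semi-Supervised learning is a natural approach to this problem.
+2. **Internet Content Classification:** Labelling every webpage is impractical, so Semi-Supervised learning algorithms are used instead. Even the Google search algorithm uses a variant of Semi-Supervised learning to rank the relevance of a webpage for a given query.
+3. 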
**Protein Sequence Classification:** Since DNA strands are typically very large, the rise of Semi-Supervised learning has been imminent in this field. + +### Reinforcement machine learning +This trains machines through trial and error to take the best action by establishing a reward system. Reinforcement learning can train models to play games or train autonomous vehicles to drive by telling the machine when it made the right decisions, which helps it learn over time what actions it should take. + +Practical applications of Reinforcement Learning - +1. **Production Systems** + e.g. Google Cloud AutoML, Facebook Horizon, Recommendation, advertisement, search +2. **Autonomous Driving** +3. **Business Management** + e.g. solving the vehicle routing problem, fraudulent behaviour in e-commerce, Concurrent reinforcement learning from customer interactions +4. **Recommender systems** + e.g. for search, recommendation, and online advertising + +### Machine learning is also associated with several other artificial intelligence subfields: + +### Natural language processing + +Natural language processing is a field of machine learning in which machines learn to understand natural language as spoken and written by humans instead of the data and numbers normally used to program computers. This allows machines to recognize the language, understand it, and respond to it, as well as create new text and translate between languages. Natural language processing enables familiar technology like chatbots and digital assistants like Siri or Alexa. + +Practical applications of NLP: +1. **Question Answering:** Question Answering focuses on building systems that automatically answer the questions asked by humans in a natural language. +2. **Spam Detection:** Spam detection is used to detect unwanted e-mails getting to a user's inbox. +3. **Sentiment Analysis:** Sentiment Analysis is also known as opinion mining. 
It is used on the web to analyze the attitude, behaviour, and emotional state of the sender. This application is implemented through a combination of NLP (Natural Language Processing) and statistics by assigning values to the text (positive, negative, or neutral) and identifying the mood of the context (happy, sad, angry, etc.).
+4. **Machine Translation:** Machine translation is used to translate text or speech from one natural language to another, e.g. Google Translate.
+5. **Spelling Correction:** Word processing and presentation software such as Microsoft Word and PowerPoint use it to detect and correct spelling errors.
+
+### Neural networks
+
+Neural networks are a commonly used, specific class of machine learning algorithms. Artificial neural networks are modelled on the human brain, in which thousands or millions of processing nodes are interconnected and organized into layers.
+
+In an artificial neural network, cells, or nodes, are connected, with each cell processing inputs and producing an output that is sent to other nodes. Labeled data moves through the nodes, with each one performing a different function. In a neural network trained to identify whether a picture contains a cat, the different nodes assess the information and arrive at an output that indicates whether the picture features a cat.
+
+Practical applications of Neural Networks:
+1. **Stock Market Prediction:** To predict stock prices in real time, a Multilayer Perceptron (MLP), a class of feedforward artificial neural network, can be employed. An MLP comprises multiple layers of nodes, and each layer is fully connected to the next. A stock's past performance, annual returns, and non-profit ratios are considered when building the MLP model.
+2. **Social Media:** Multi-layered Perceptrons forecast social media trends. 
It is trained and evaluated with error measures such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Squared Error (MSE). The MLP takes into consideration several factors, such as the user's favourite Instagram pages and bookmarked choices. After individuals' behaviour on social media networks has been analysed, the data can be linked to people's spending habits. MLP-based ANNs are used to mine data from social media applications.
+3. **Aerospace:** Aerospace Engineering is an expansive term that covers developments in spacecraft and aircraft. Fault diagnosis, high-performance auto-piloting, securing aircraft control systems, and modelling key dynamic simulations are some of the areas in which neural networks are now applied. Time-delay neural networks can be employed for modelling non-linear, time-dependent dynamic systems.
+
+### Deep learning
+
+Deep learning networks are neural networks with many layers. The layered network can process extensive amounts of data and determine the “weight” of each link in the network. For example, in an image recognition system, some layers of the neural network might detect individual features of a face, like eyes, nose, or mouth, while another layer would be able to tell whether those features appear in a way that indicates a face.
+
+Practical applications of Deep Learning:
+1. **Automatic Text Generation –** A model learns from a corpus of text and then generates new text word by word or character by character. Such a model can learn how to spell, punctuate, and form sentences, and it may even capture the style of the corpus.
+2. **Healthcare –** Helps in diagnosing various diseases and treating them.
+3. **Automatic Machine Translation –** Words, phrases, or sentences in one language are transformed into another language (Deep Learning is achieving top results in the areas of text and images).
+4. **Image Recognition –** Recognizes and identifies people and objects in images and understands their content and context. 
This area is already being used in Gaming, Retail, Tourism, etc. +5. **Predicting Earthquakes –** Teaches a computer to perform viscoelastic computations, which are used in predicting earthquakes. + +## [Web Technology](Web%20Technology/WebTechnology.md#web-tecnology) +Web Technology refers to the various tools and techniques that are utilized in the process of communication between different types of devices over the Internet. A web browser is used to access web pages. Web browsers can be defined as programs that display text, data, pictures, animation, and video on the Internet. Hyperlinked resources on the World Wide Web can be accessed using software interfaces provided by Web browsers. +### Web Technology can be classified into the following sections: +- World Wide Web (WWW) +The World Wide Web is based on several different technologies: Web browsers, Hypertext Markup Language (HTML), and Hypertext Transfer Protocol (HTTP). +- Web Browser +The web browser is an application software to explore www (World Wide Web). It provides an interface between the server and the client and requests to the server for web documents and services. +- Web Server +A web server is a program that processes the network requests of the users and serves them with files that create web pages. This exchange takes place using Hypertext Transfer Protocol (HTTP). +- Web Pages +A webpage is a digital document that is linked to the World Wide Web and viewable by anyone connected to the Internet who has a web browser. +- Web Development +Web development refers to the building, creating, and maintaining of websites. It includes aspects such as web design, web publishing, web programming, and database management. It is the creation of an application that works over the Internet, i.e., websites. +### Web Development can be classified into two ways: +### Frontend Development +The part of a website where the user interacts directly is termed the front end. 
It is also referred to as the ‘client side’ of the application. +### Backend Development +The backend is the server side of a website. It is part of the website that users cannot see and interact with. It is the portion of software that does not come in direct contact with the users. It is used to store and arrange data. + + +## [Networking](Networking/readme.md#networking) +A computer network is a set of computers sharing resources located on or provided by network nodes. Computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are made up of telecommunication network technologies based on physically wired, optical, and wireless radio-frequency methods that may be arranged in a variety of network topologies. + +The nodes of a computer network can include personal computers, servers, networking hardware, or other specialized or general-purpose hosts. They are identified by network addresses and may have hostnames. Hostnames serve as memorable labels for the nodes, rarely changed after the initial assignment. Network addresses serve for locating and identifying the nodes by communication protocols such as the Internet Protocol. + +Computer networks may be classified by many criteria, including the transmission medium used to carry signals, bandwidth, communications protocols to organize network traffic, the network size, the topology, traffic control mechanism, and organizational intent. + +### Types of networking +There are two primary types of computer networking: +- Wired networking: Wired networking requires the use of a physical medium for transport between nodes. Copper-based Ethernet cabling, popular due to its low cost and durability, is commonly used for digital communications in businesses and homes. Alternatively, optical fibre is used to transport data over greater distances and at faster speeds, but it has several tradeoffs, including higher costs and more fragile components. 
+- Wireless networking: Wireless networking uses radio waves to transport data over the air, enabling devices to be connected to a network without any cabling. Wireless LANs are the most well-known and widely deployed form of wireless networking. Alternatives include microwave, satellite, cellular, and Bluetooth, among others. +## OSI MODEL +OSI stands for **Open Systems Interconnection**. It was developed by ISO – ‘**International Organization for Standardization**‘in the year 1984. It is a 7-layer architecture with each layer having specific functionality to perform. All these seven layers work collaboratively to transmit the data from one person to another across the globe. + +#### **1\. Physical Layer (Layer 1):** + +The lowest layer of the OSI reference model is the physical layer. It is responsible for the actual physical connection between the devices. The physical layer contains information in the form of **bits.** It is responsible for transmitting individual bits from one node to the next. When receiving data, this layer will get the signal received and convert it into 0s and 1s and send them to the Data Link layer, which will put the frame back together. + +![](Networking/OSI%20Model/img/computer-network-osi-model-layers-bits.png) + +The functions of the physical layer are as follows: + +1. **Bit synchronization:** The physical layer provides the synchronization of the bits by providing a clock. This clock controls both sender and receiver thus providing synchronization at the bit level. +2. **Bit rate control:** The Physical layer also defines the transmission rate, i.e., the number of bits sent per second. +3. **Physical topologies:** Physical layer specifies how the different devices/nodes are arranged in a network, i.e., bus, star, or mesh topology. +4. **Transmission mode:** Physical layer also defines how the data flows between the two connected devices. The various transmission modes possible are Simplex, half-duplex and full-duplex. + +#### **2\. 
Data Link Layer (DLL) (Layer 2):** + +The data link layer is responsible for the node-to-node delivery of the message. The main function of this layer is to make sure data transfer is error-free from one node to another over the physical layer. When a packet arrives in a network, it is the responsibility of the DLL to transmit it to the host using its MAC address. +The Data Link Layer is divided into two sublayers: + +1. Logical Link Control (LLC) +2. Media Access Control (MAC) + +The packet received from the Network layer is further divided into frames depending on the frame size of the NIC(Network Interface Card). DLL also encapsulates the Sender and Receiver’s MAC address in the header. + +The Receiver’s MAC address is obtained by placing an ARP(Address Resolution Protocol) request onto the wire asking, “Who has that IP address?” and the destination host will reply with its MAC address. + +![](Networking/OSI%20Model/img/computer-network-osi-model-layers-framing.png) + +The functions of the Data Link layer are : + +1. **Framing:** Framing is a function of the data link layer. It provides a way for a sender to transmit a set of bits that are meaningful to the receiver. This can be accomplished by attaching special bit patterns to the beginning and end of the frame. +2. **Physical Addressing:** After creating frames, the Data link layer adds physical addresses (MAC addresses) of the sender and/or receiver in the header of each frame. +3. **Error control:** Data link layer provides the mechanism of error control in which it detects and retransmits damaged or lost frames. +4. **Flow Control:** The data rate must be constant on both sides or else the data may get corrupted; thus, flow control coordinates the amount of data that can be sent before receiving an acknowledgement. +5. 
**Access control:** When a single communication channel is shared by multiple devices, the MAC sub-layer of the data link layer helps to determine which device has control over the channel at a given time. + +#### **3\. Network Layer (Layer 3):** + +The network layer works for the transmission of data from one host to the other located in different networks. It also takes care of packet routing, i.e., the selection of the shortest path to transmit the packet from the number of routes available. The sender & receiver’s IP addresses are placed in the header by the network layer. + +The functions of the Network layer are : + +1. **Routing:** The network layer protocols determine which route is suitable from source to destination. This function of the network layer is known as routing. +2. **Logical Addressing:** To identify each device on internetwork uniquely, the network layer defines an addressing scheme. The sender & receiver’s IP addresses are placed in the header by the network layer. Such an address distinguishes each device uniquely and universally. + + +## [Internet](Internet/readme.md#internet) +The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite ([TCP/IP](Networking/readme.md#tcptransmission-control-protocol)) to serve billions of users worldwide. It is a network of networks that consists of millions of private, public, academic, business, and government networks of local to global scope that is linked by a broad array of electronic, wireless, and optical networking technologies. The Internet carries an extensive range of information resources and services, such as the interlinked hypertext documents and applications of the World Wide Web ([WWW](Internet/readme.md#world-wide-web-www)) and the infrastructure to support email. 
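The logical addressing described for the Network layer can be illustrated with a small sketch built on Python's standard `ipaddress` module. The helper name `same_network` and the sample addresses are assumptions chosen for illustration; the sketch only shows how an IP address and a network prefix decide whether two hosts sit on the same network or whether a packet has to be routed between networks.

```python
import ipaddress


def same_network(host_a: str, host_b: str, prefix: str) -> bool:
    """Return True when both host addresses fall inside the given network prefix."""
    network = ipaddress.ip_network(prefix)
    return (ipaddress.ip_address(host_a) in network
            and ipaddress.ip_address(host_b) in network)


if __name__ == "__main__":
    # Both hosts are inside 192.168.1.0/24, so delivery stays on the local network.
    print(same_network("192.168.1.10", "192.168.1.20", "192.168.1.0/24"))
    # The second host lies outside the prefix, so the packet would need routing.
    print(same_network("192.168.1.10", "10.0.0.5", "192.168.1.0/24"))
```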
+ +### [World Wide Web (WWW)](Internet/readme.md#world-wide-web-www) +The World Wide Web (WWW) is an information space where documents and other web resources are identified by Uniform Resource Locators (URLs), interlinked by hypertext links, and accessible via the Internet. English scientist Tim Berners-Lee invented the World Wide Web in 1989. He wrote the first web browser in 1990 while employed at CERN in Switzerland. The browser was released outside CERN in 1991, first to other research institutions starting in January 1991 and to the general public on the Internet in August 1991. + +### [Internet Protocol (IP)](Internet/readme.md#internet-protocol-ip) +The Internet Protocol (IP) is a protocol, or set of rules, for routing and addressing packets of data so that they can travel across networks and arrive at the correct destination. Data traversing the Internet is divided into smaller pieces called packets. + +## [DBMS]() + +What is a Database? +------------------- + +A database is a collection of related data that represents some aspect of the real world. A database system is designed to be built and populated with data for a certain task. + +What is DBMS? +------------- + +**Database Management System (DBMS)** is software for storing and retrieving users' data while considering appropriate security measures. It consists of a group of programs that manipulate the database. The DBMS accepts the request for data from an application and instructs the operating system to provide the specific data. In large systems, a DBMS helps users and other third-party software store and retrieve data. + +DBMS allows users to create their databases as per their requirements. The term "DBMS" includes the use of a database and other application programs. It provides an interface between the data and the software application. + +Example of a DBMS +----------------- + +Let us see a simple example of a university database. 
This database maintains information concerning students, courses, and grades in a university environment. The database is organized into five files:
+
+* The STUDENT file stores the data of each student.
+* The COURSE file stores data on each course.
+* The SECTION file stores information about the sections of a particular course.
+* The GRADE file stores the grades that students receive in the various sections.
+* The TUTOR file contains information about each professor.
+
+To define this database:
+
+* We need to specify the structure of the records of each file by defining the different types of data elements to be stored in each record.
+* We can also use a coding scheme to represent the values of a data item.
+* In practice, the database will have five tables, with foreign keys defined among them.
+
+History of DBMS
+---------------
+
+Here are the important landmarks from history:
+
+* 1960 – Charles Bachman designed the first DBMS.
+* 1970 – Edgar F. Codd of IBM introduced the relational model of data.
+* 1976 – Peter Chen coined and defined the Entity-relationship model, also known as the ER model.
+* 1980 – The relational model becomes widely accepted.
+* 1985 – Object-oriented DBMSs are developed.
+* 1990s – Object orientation is incorporated into relational DBMSs.
+* 1992 – Microsoft ships Microsoft Access, a personal DBMS that goes on to displace many other personal DBMS products.
+* 1995 – First Internet database applications.
+* 1997 – XML is applied to database processing. Many vendors begin to integrate XML into DBMS products.
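The university database example from this section can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a prescribed design: the column names, the sample rows, and the helper functions `build_university_db` and `grades_for` are all assumptions. It shows five tables linked by foreign keys and a query that joins them.

```python
import sqlite3

# Hypothetical schema: one table per file named in the example.
SCHEMA = """
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tutor   (tutor_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE section (
    section_id INTEGER PRIMARY KEY,
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    tutor_id   INTEGER NOT NULL REFERENCES tutor(tutor_id)
);
CREATE TABLE grade (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    section_id INTEGER NOT NULL REFERENCES section(section_id),
    grade      TEXT NOT NULL,
    PRIMARY KEY (student_id, section_id)
);
"""


def build_university_db() -> sqlite3.Connection:
    """Create an in-memory database and load a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO student VALUES (1, 'Ada')")
    conn.execute("INSERT INTO tutor VALUES (1, 'Dr. Smith')")
    conn.execute("INSERT INTO course VALUES (1, 'Databases')")
    conn.execute("INSERT INTO section VALUES (1, 1, 1)")
    conn.execute("INSERT INTO grade VALUES (1, 1, 'A')")
    conn.commit()
    return conn


def grades_for(conn: sqlite3.Connection, student_name: str) -> list:
    """Return (course title, grade) pairs for one student by joining the tables."""
    query = """
        SELECT course.title, grade.grade
        FROM grade
        JOIN student ON student.student_id = grade.student_id
        JOIN section ON section.section_id = grade.section_id
        JOIN course  ON course.course_id  = section.course_id
        WHERE student.name = ?
    """
    return conn.execute(query, (student_name,)).fetchall()


if __name__ == "__main__":
    print(grades_for(build_university_db(), "Ada"))
```

With foreign keys enabled, inserting a grade for a student or section that does not exist is rejected, which is the kind of consistency a DBMS provides over plain files.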
+
+Characteristics of DBMS
+-----------------------
+
+Here are the characteristics and properties of a Database Management System:
+
+* Provides security and removes redundancy
+* Self-describing nature of the database system
+* Insulation between programs and data, and data abstraction
+* Support for multiple views of the data
+* Sharing of data and multi-user transaction processing
+* Allows entities and the relations among them to be represented as tables
+* Follows the ACID properties (Atomicity, Consistency, Isolation, and Durability)
+* Supports a multi-user environment that allows users to access and manipulate data in parallel
+
+Popular DBMS Software
+---------------------
+
+Here is a list of some popular DBMS products:
+
+* MySQL
+* Microsoft Access
+* Oracle
+* PostgreSQL
+* dBASE
+* FoxPro
+* SQLite
+* IBM DB2
+* LibreOffice Base
+* MariaDB
+* Microsoft SQL Server
+
+## [Cryptography](Cryptography/readme.md#cryptography)
+Cryptography is a technique for securing data and communication. It protects information and communications through the use of codes so that only those for whom the information is intended can read and process it. Cryptography is used to protect data in transit, at rest, and in use. The prefix _crypt_ means "hidden" or "secret", and the suffix _graphy_ means "writing".
+
+### Types of Cryptography
+There are two main types of cryptography:
+1. [Symmetric Cryptography](Cryptography/readme.md#symmetric-cryptography)
+2. [Asymmetric Cryptography](Cryptography/readme.md#asymmetric-cryptography)
+
+### [Crypto Currency](Cryptography/CryptoCurrency/readme.md#crypto-currency)
+Cryptocurrency is a digital currency in which encryption techniques are used to regulate the generation of units of currency and verify the transfer of funds, operating independently of a central bank. Cryptocurrencies use decentralized control as opposed to centralized digital currency and central banking systems. 
The decentralized control of each cryptocurrency works through distributed ledger technology, typically a blockchain, that serves as a public financial transaction database. A defining feature of a cryptocurrency, and arguably its greatest appeal, is that it is not issued by any central authority, which makes it theoretically resistant to government interference or manipulation.

### Types of Crypto Currency
Cryptocurrencies are commonly grouped by the consensus mechanism they use:
1. [Proof of Work](Cryptography/CryptoCurrency/ProofOfWork/readme.md#proof-of-work)
2. [Proof of Stake](Cryptography/CryptoCurrency/ProofOfStake/readme.md#proof-of-stake)

### Most Popular Crypto Currencies
1. [Bitcoin](Cryptography/CryptoCurrency/ProofOfWork/Bitcoin/readme.md#bitcoin)
2. [Ethereum](Cryptography/CryptoCurrency/ProofOfStake/Ethereum/readme.md#ethereum)
3. [Litecoin](Cryptography/CryptoCurrency/ProofOfWork/Litecoin/readme.md#litecoin)
4. [Cardano](Cryptography/CryptoCurrency/ProofOfStake/Cardano/readme.md#cardano)
5. [Dogecoin](Cryptography/CryptoCurrency/ProofOfWork/Dogecoin/readme.md#dogecoin)

## Theory of Computation
In theoretical computer science and mathematics, the theory of computation is the branch that deals with what problems can be solved on a model of computation using an algorithm, how efficiently they can be solved, and to what degree (e.g., approximate solutions versus precise ones). The field is divided into three major branches: automata theory and formal languages, computability theory, and computational complexity theory, which are linked by the question: "What are the fundamental capabilities and limitations of computers?"

### Automata Theory
Automata theory is the study of abstract machines and automata, as well as the computational problems that can be solved using them. It is a theory in theoretical computer science. The word automata comes from the Greek word αὐτόματος, which means "self-acting, self-willed, self-moving".
An automaton (automata in plural) is an abstract self-propelled computing device that follows a predetermined sequence of operations automatically. An automaton with a finite number of states is called a Finite Automaton (FA) or Finite-State Machine (FSM). A finite-state machine, a well-known type of automaton, consists of states (usually drawn as circles) and transitions (drawn as arrows). As the automaton reads each input symbol, it makes a transition (or jump) to another state according to its transition function, which takes the current state and the current input symbol as its arguments.

### Formal Languages
In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.

The alphabet of a formal language consists of symbols, letters, or tokens that concatenate into strings of the language. Each string concatenated from symbols of this alphabet is called a word, and the words that belong to a particular formal language are sometimes called well-formed words or well-formed formulas. A formal language is often defined by means of a formal grammar, such as a regular grammar or a context-free grammar, which consists of its formation rules.

In computer science, formal languages are used, among other things, as the basis for defining the grammar of programming languages and formalized versions of subsets of natural languages in which the words of the language represent concepts that are associated with particular meanings or semantics. In computational complexity theory, decision problems are typically defined as formal languages, and complexity classes are defined as the sets of formal languages that can be parsed by machines with limited computational power.
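
To make the link between automata and formal languages concrete, here is a minimal sketch of a deterministic finite automaton whose transition function takes the current state and the current input symbol. The `DFA` class, the state names, and the example language (binary strings with an even number of `1`s) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """A deterministic finite automaton over a fixed alphabet."""
    alphabet: frozenset
    transitions: dict      # maps (state, symbol) to the next state
    start: str
    accepting: frozenset

    def accepts(self, word: str) -> bool:
        """Decide whether `word` belongs to the language of this automaton."""
        state = self.start
        for symbol in word:
            if symbol not in self.alphabet:
                return False  # not a string over the alphabet at all
            state = self.transitions[(state, symbol)]
        return state in self.accepting


# Two states track the parity of the number of 1s read so far.
EVEN_ONES = DFA(
    alphabet=frozenset("01"),
    transitions={
        ("even", "0"): "even",
        ("even", "1"): "odd",
        ("odd", "0"): "odd",
        ("odd", "1"): "even",
    },
    start="even",
    accepting=frozenset({"even"}),
)

for word in ["", "0", "1", "1001", "1011"]:
    print(repr(word), EVEN_ONES.accepts(word))
```

The set of words the automaton accepts is itself a formal language, and calling `accepts` answers the corresponding decision problem: is this word in the language or not?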
In logic and the foundations of mathematics, formal languages are used to represent the syntax of axiomatic systems, and mathematical formalism is the philosophy that all mathematics can be reduced to the syntactic manipulation of formal languages in this way.

### Computability Theory
Computability theory, also known as recursion theory, is a branch of mathematical logic, computer science, and the theory of computation that originated in the 1930s with the study of computable functions and Turing degrees. The field has since expanded to include the study of generalized computability and definability. In these areas, computability theory overlaps with proof theory and effective descriptive set theory.

### Computational Complexity Theory
In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage and relating these classes to each other. A computational problem is a task solved by a computer; it is solvable by the mechanical application of mathematical steps, such as an algorithm.

A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used. The theory formalizes this intuition by introducing mathematical models of computation to study these problems and quantifying their computational complexity, i.e., the amount of resources needed to solve them, such as time and storage. Other measures of complexity are also used, such as the amount of communication (used in communication complexity), the number of gates in a circuit (used in circuit complexity), and the number of processors (used in parallel computing). One of the roles of computational complexity theory is to determine the practical limits on what computers can and cannot do. The P versus NP problem, one of the seven Millennium Prize Problems, belongs to the field of computational complexity.
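
The distinction behind the P versus NP question can be illustrated with the subset-sum problem: checking a proposed answer is cheap, while the obvious way of finding one tries exponentially many candidates. The sketch below is only an illustration under that framing; the function names and sample numbers are made up, and brute force is shown as the naive baseline, not as the best known algorithm.

```python
from itertools import combinations


def verify(numbers, target, certificate):
    """Check a proposed solution in time linear in its length.

    `certificate` is a sequence of distinct indices into `numbers`.
    """
    if len(set(certificate)) != len(certificate):
        return False
    if any(i < 0 or i >= len(numbers) for i in certificate):
        return False
    return sum(numbers[i] for i in certificate) == target


def brute_force_search(numbers, target):
    """Find a solution by trying every subset: up to 2**len(numbers) candidates."""
    indices = range(len(numbers))
    for size in range(len(numbers) + 1):
        for subset in combinations(indices, size):
            if verify(numbers, target, subset):
                return list(subset)
    return None  # no subset of `numbers` sums to `target`


numbers = [3, 34, 4, 12, 5, 2]
print(brute_force_search(numbers, 9))   # indices of a subset summing to 9
print(verify(numbers, 9, [2, 4]))       # checking a given answer is fast
```

Subset sum is in NP because `verify` runs in polynomial time. Whether every such problem can also be solved in polynomial time is exactly what the P versus NP problem asks.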

Closely related fields in theoretical computer science are the analysis of algorithms and computability theory. A key distinction between the analysis of algorithms and computational complexity theory is that the former is devoted to analyzing the amount of resources needed by a particular algorithm to solve a problem, whereas the latter asks a more general question about all possible algorithms that could be used to solve the same problem. More precisely, computational complexity theory tries to classify problems that can or cannot be solved with appropriately restricted resources. In turn, imposing restrictions on the available resources is what distinguishes computational complexity from computability theory: the latter asks what kinds of problems can, in principle, be solved algorithmically.

## Contributors